Abstract

A statistician designing an experiment wants to get as much infor- 
mation as possible from the data gathered. Often this means the most 
precise estimate possible (that is, an estimate with minimum possible 
variance) of the unknown parameters. If there are several parame- 
ters, this can be interpreted in many ways: do we want to minimize 
the average variance, or the maximum variance, or the volume of a 
confidence region for the parameters? 
In the case of block designs, these optimality criteria can be calculated
from the concurrence graph of the design, and in many cases
from its Laplacian eigenvalues. The Levi graph can also be used. The 
various criteria turn out to be closely connected with other proper- 
ties of the graph as a network, such as number of spanning trees, 
isoperimetric number, and the sum of the resistances between pairs of 
vertices when the graph is regarded as an electrical network. 

In this chapter, we discuss the notions of optimality for incomplete- 
block designs, explain the graph-theoretic connections, and prove some 
old and new results about optimality. 

1 What makes an incomplete-block design good 
for experiments? 

Experiments are designed in many ways: for example, Latin squares, block 
designs, split-plot designs. Combinatorialists, on the other hand, have a 
much more specialized usage of the term "design", as we remark later. We 
are concerned here with incomplete-block designs, more special than the 
statistician's designs and more general than the mathematician's. 

To a statistician, a block design has two components. There is an underlying
set of experimental units, partitioned into b blocks of size k. There is a
further set of v treatments, and also a function $f$ from units to treatments,
specifying which treatment is allocated to which experimental unit; that is,
$f(\omega)$ is the treatment allocated to experimental unit $\omega$. Thus each block
defines a subset, or maybe a multi-subset, of the treatments.

In a complete-block design, we have k = v and each treatment occurs once 
in every block. Here we assume that blocks are incomplete in the sense that 
k < v.

We assume that the purpose of the experiment is to find out about the 
treatments, and differences between them. The blocks are an unavoidable 
nuisance, an inherent feature of the experimental units. In an agricultural 
experiment the experimental units may be field plots and the blocks may 
be fields or plough-lines; in a clinical trial the experimental units may be 
patients and the blocks hospitals; in process engineering the experimental 
units may be runs of a machine that is recalibrated each day and the blocks 
days. See for further examples. 

In all of these situations, the values of b, k and v are given. Given 
these values, not all incomplete-block designs are equally good. This chapter 
describes some criteria that can be used to choose between them. 

For example, Fig. 1 shows two block designs with v = 15, b = 7 and k = 3.
We use the convention that the treatments are labelled 1, . . . , v, that
columns represent blocks, and that the order of the entries in each column
is not significant. Where necessary, we use the notation $\Gamma_j$ to refer to the
block which is shown as the jth column, for j = 1, . . . , b.

Figure 1: Two block designs with v = 15, b = 7 and k = 3

The replication $r_i$ of treatment i is defined to be $|f^{-1}(i)|$, which is the
number of experimental units to which it is allocated. For the design in
Fig. 1(a), $r_i \in \{1, 2\}$ for all i. As we see later, statisticians tend to prefer
designs in which all the replications are as equal as possible. If $r_i = r_j$ for
$1 \le i < j \le v$ then the design is equireplicate: then the common value of $r_i$
is usually written as r, and vr = bk.

The design in Fig. 1(b) is a queen-bee design because there is (at least)
one treatment that occurs in every block. Scientists tend to prefer such 
designs because they have been taught to compare every treatment to one 
distinguished treatment, which may be called a control treatment. 
Figure 2: Two block designs with v = 5, b = 7 and k = 3

Fig. 2 shows two block designs with v = 5, b = 7 and k = 3. The design
in Fig. 2(b) shows a new feature: treatment 1 occurs on two experimental
units in block $\Gamma_1$. A block design is binary if $f(\alpha) \ne f(\omega)$ whenever $\alpha$ and $\omega$
are distinct experimental units in the same block. The design in Fig. 2(a) is binary.
It seems to be obvious that binary designs must be better than non-binary
ones, but we shall see later that this is not necessarily so. However, if there
is any block on which $f$ is constant, then that block provides no information
about treatments, so we assume from now on that there are no such blocks.
Figure 3: Two block designs with v = 7, b = 7 and k = 3

Fig. 3 shows two equireplicate binary block designs with v = 7, b = 7
and k = 3. A binary design is balanced if every pair of distinct treatments
occurs together in the same number of blocks. If that number is $\lambda$, then
$r(k - 1) = (v - 1)\lambda$. Such designs are also called 2-designs or BIBDs. The
design in Fig. 3(a) is balanced with $\lambda = 1$; the design in Fig. 3(b) is not
balanced.
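The definitions of binary, equireplicate and balanced designs can be checked mechanically. The following is a minimal sketch: the block list is one standard labelling of the Fano plane, used here as an assumed stand-in for a balanced design with v = 7, b = 7 and k = 3 (it is not claimed to be the labelling of Fig. 3(a)), and all variable names are illustrative.

```python
from collections import Counter
from itertools import combinations

# Assumed example: a standard labelling of the Fano plane (v = 7, b = 7, k = 3).
blocks = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
          (2, 5, 7), (3, 4, 7), (3, 5, 6)]
v, b, k = 7, len(blocks), 3

# Replication r_i: number of experimental units receiving treatment i.
replication = Counter(t for block in blocks for t in block)

# Concurrence of each unordered pair of distinct treatments.
concurrence = Counter(pair for block in blocks
                      for pair in combinations(sorted(block), 2))

is_binary = all(len(set(block)) == len(block) for block in blocks)
is_equireplicate = len(set(replication.values())) == 1
all_pairs = list(combinations(range(1, v + 1), 2))
is_balanced = is_binary and len({concurrence[p] for p in all_pairs}) == 1

r = replication[1]        # common replication (design is equireplicate)
lam = concurrence[(1, 2)]  # common concurrence (design is balanced)
print(is_binary, is_equireplicate, is_balanced, r, lam)
```

For this design the two counting identities vr = bk and r(k - 1) = (v - 1)λ can be confirmed directly from the computed values of r and λ.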

Pure mathematicians usually assume that, if they exist, balanced designs 
are better than non-balanced ones. (Indeed, many do not call a structure a 
'design' unless it is balanced.) As we shall show in Section 4.1, this
assumption is correct for all the criteria considered here. However, for given values
of v and k, a non-balanced design with a larger value of b may produce more
information than a balanced design with a smaller value of b. 



2 Graphs from block designs 
2.1 The Levi graph 

A simple way of representing a block design is its Levi graph, or incidence 
graph, introduced in [4Dj. This graph has v + b vertices, one for each block 
and one for each treatment. There are bk edges, one for each experimental 
unit. If experimental unit $\omega$ is in block j and $f(\omega) = i$, then the corresponding
edge $\tilde{e}_\omega$ joins vertices i and j. Thus the graph is bipartite, with one
part consisting of block vertices and the other part consisting of treatment
vertices. Moreover, the graph has multiple edges if the design is not binary.
Fig. 4 gives the Levi graph of the design in Fig. 2(b).

We regard two block designs as the same if one can be obtained from 
the other by permuting the experimental units within each block. Since the 
vertices of the Levi graph are labelled, there is a bijection between block 
designs and their Levi graphs. 

Let $n_{ij}$ be the number of edges from treatment-vertex i to block-vertex j;
that is, treatment i occurs on $n_{ij}$ experimental units in block j. The v × b
matrix N whose entries are the $n_{ij}$ is the incidence matrix of the block design.
If the rows and columns of N are labelled, we can recover the block design
from its incidence matrix.

Figure 4: The Levi graph of the design in Fig. 2(b)

2.2 The concurrence graph 

In a binary design, the concurrence $\lambda_{ij}$ of treatments i and j is $r_i$ if i = j
and otherwise is the number of blocks in which i and j both occur. For non-binary
designs we have to count the number of occurrences of the pair {i, j}
in blocks according to multiplicity, so that $\lambda_{ij}$ is the (i, j)-entry of $\Lambda$, where
$\Lambda = NN^\top$. The matrix $\Lambda$ is called the concurrence matrix of the design.
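As a small illustration, the incidence matrix and the concurrence matrix can be built directly from a list of blocks. This is a sketch only: the non-binary design below (v = 4, b = 3, k = 3) and the helper name `incidence_matrix` are assumptions made for the example.

```python
import numpy as np

def incidence_matrix(blocks, v):
    """v x b matrix whose (i, j) entry counts the units in block j
    that receive treatment i (treatments are labelled 1..v)."""
    N = np.zeros((v, len(blocks)), dtype=int)
    for j, block in enumerate(blocks):
        for t in block:
            N[t - 1, j] += 1
    return N

# Hypothetical non-binary design: treatment 1 occurs twice in the first block.
blocks = [(1, 1, 2), (1, 3, 4), (2, 3, 4)]
N = incidence_matrix(blocks, v=4)

# Concurrence matrix: entry (i, j) counts pairs {i, j} with multiplicity.
Lam = N @ N.T
print(N)
print(Lam)
```

Each column of N sums to the block size k, and the concurrence of treatments 1 and 2 is 2 because the doubled treatment 1 meets treatment 2 twice in the first block.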

The concurrence graph of the design has the treatments as vertices. There 
are no loops. If $i \ne j$, then there are $\lambda_{ij}$ edges between vertices i and j. Each
such edge corresponds to a pair $\{\alpha, \omega\}$ of experimental units in the same
block, with $f(\alpha) = i$ and $f(\omega) = j$: we denote this edge by $e_{\alpha\omega}$. (This edge
does not join the experimental units $\alpha$ and $\omega$; it joins the treatments applied
to these units.) It follows that the degree $d_i$ of vertex i is given by

\[ d_i = \sum_{j \ne i} \lambda_{ij}. \qquad (1) \]

Figs. 5 and 6 show the concurrence graphs of the designs in Figs. 1 and 2,
respectively.

If k = 2, then the concurrence graph is effectively the same as the block
design. Although the block design cannot be recovered from the concurrence
graph for larger values of k, we shall see in Section 3.2 that the concurrence
graphs contain enough information to decide between two block designs on
any of the usual statistical criteria. They were introduced as variety concurrence
graphs in [13], but are so useful that they may have been considered
earlier.
2.3 The Laplacian matrix of a graph

Let H be an arbitrary graph with n vertices: it may have multiple edges, but
no loops. The Laplacian matrix L of H is defined to be the square matrix
with rows and columns indexed by the vertices of H whose (i, i)-entry $L_{ii}$
is the valency of vertex i and whose (i, j)-entry $L_{ij}$ is the negative of the
number of edges between vertices i and j if $i \ne j$. Then $L_{ii} = -\sum_{j \ne i} L_{ij}$ for
$1 \le i \le n$, and so the row sums of L are all zero. It follows that L has
eigenvalue 0 on the all-1 vector; this is called the trivial eigenvalue of L. We
show below that the multiplicity of the zero eigenvalue is equal to the number
of connected components of H. Thus the multiplicity is 1 if and only if H is
connected.

Call the remaining eigenvalues of L non-trivial. They are all non-negative, 
as we show in the following theorem (see [7]).

Theorem 1 (a) If L is a Laplacian matrix, then L is positive semi-definite.

(b) If L is a Laplacian matrix of order n and x is any vector in $\mathbb{R}^n$, then

\[ x^\top L x = \sum_{\text{edges } ij} (x_i - x_j)^2. \]

(c) If $L_1$ and $L_2$ are the Laplacian matrices of graphs $H_1$ and $H_2$ with the
same vertices, and if $H_2$ is obtained from $H_1$ by inserting one extra
edge, then $L_2 - L_1$ is positive semi-definite.

(d) If L is the Laplacian matrix of the graph H, then the multiplicity of the
zero eigenvalue of L is equal to the number of connected components
of H.

Proof Each edge between vertices i and j defines an n × n matrix whose
entries are all 0 apart from the following submatrix in rows and columns i and j:

\[ \begin{array}{c|rr} & i & j \\ \hline i & 1 & -1 \\ j & -1 & 1 \end{array} \]

The Laplacian is the sum of these matrices, which are all positive semi-definite.
This proves (a), (b) and (c).

From (b), the vector x is in the null space of the Laplacian if and only
if x takes the same value on both vertices of each edge, which happens if
and only if it takes a constant value on each connected component. This
proves (d). ■
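The statements of Theorem 1 are easy to test numerically on a small example. The sketch below is illustrative only: the graph (a triangle with one doubled edge, together with a disjoint single edge) and the helper name `laplacian` are assumptions made for the example.

```python
import numpy as np

def laplacian(n, edges):
    """Laplacian of a loopless multigraph on vertices 0..n-1.
    Repeated entries in `edges` represent multiple edges."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1
        L[j, j] += 1
        L[i, j] -= 1
        L[j, i] -= 1
    return L

# Hypothetical graph with two connected components.
edges = [(0, 1), (0, 1), (1, 2), (0, 2), (3, 4)]
L = laplacian(5, edges)

# Part (b): the quadratic form equals the sum of squared differences over edges.
x = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
quad = x @ L @ x
edge_sum = sum((x[i] - x[j]) ** 2 for i, j in edges)

# Parts (a) and (d): eigenvalues are non-negative, and the multiplicity
# of the eigenvalue 0 counts the connected components.
eigenvalues = np.linalg.eigvalsh(L)
n_zero = int(np.sum(np.abs(eigenvalues) < 1e-9))
print(quad, edge_sum, n_zero)
```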
Theorem 1 shows that the smallest non-trivial eigenvalue of a connected
graph is positive. This eigenvalue is sometimes called the algebraic connectivity
of the graph. The statistical importance of this is shown in Section 3.2.

In Section 3.1 we shall need the Moore-Penrose generalized inverse $L^-$
of L (see [15]). Put $P_0 = n^{-1}J_n$, where $J_n$ is the n × n matrix whose entries
are all 1, so that $P_0$ is the matrix of orthogonal projection onto the space
spanned by the all-1 vector. If H is connected then $L + P_0$ is invertible, and

\[ L^- = (L + P_0)^{-1} - P_0, \]

so that $LL^- = L^-L = I_n - P_0$, where $I_n$ is the identity matrix of order n.
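The formula for the generalized inverse can be compared with a general-purpose pseudoinverse routine. The following sketch uses the cycle on six vertices as an assumed example of a connected graph; `numpy.linalg.pinv` computes the Moore-Penrose inverse by a different route (singular value decomposition).

```python
import numpy as np

n = 6
# Laplacian of the 6-cycle: a hypothetical connected example.
L = 2 * np.eye(n)
for i in range(n):
    L[i, (i + 1) % n] -= 1
    L[(i + 1) % n, i] -= 1

P0 = np.ones((n, n)) / n                  # projection onto the all-1 vector
L_minus = np.linalg.inv(L + P0) - P0      # formula from the text

print(np.allclose(L_minus, np.linalg.pinv(L)))
print(np.allclose(L @ L_minus, np.eye(n) - P0))
```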

2.4 Laplacians of the concurrence and Levi graphs 

There is a relationship between the Laplacian matrices of the concurrence
and Levi graphs of a block design $\Delta$. Let N be the incidence matrix of
the design, and R the diagonal matrix (with rows and columns indexed by
treatments) whose (i, i) entry is the replication $r_i$ of treatment i. If the design
is equireplicate, then $R = rI_v$, where r is the replication number.

For the remainder of the paper, we will use L for the Laplacian matrix of
the concurrence graph G of $\Delta$, and $\tilde{L}$ for the Laplacian matrix of the Levi
graph $\tilde{G}$ of $\Delta$. Then it is straightforward to show that

\[ L = kR - NN^\top, \qquad \tilde{L} = \begin{bmatrix} R & -N \\ -N^\top & kI_b \end{bmatrix}. \]

The Levi graph is connected if and only if the concurrence graph is connected;
thus 0 is a simple eigenvalue of $\tilde{L}$ if and only if it is a simple eigenvalue
of L, which in turn occurs if and only if all contrasts between treatment
parameters are estimable (see Section 3.1). A block design with this property
is itself called connected: we consider only connected block designs.

In the equireplicate case, the above expressions for L and $\tilde{L}$ give a relationship
between their Laplacian eigenvalues, as follows. Let x be an eigenvector
of L with eigenvalue $\phi \ne rk$. Then, for each of the two solutions $\theta$ of the
quadratic equation

\[ rk - \phi = (r - \theta)(k - \theta), \]

there is a unique vector z in $\mathbb{R}^b$ such that $[x^\top\ z^\top]^\top$ is an eigenvector of $\tilde{L}$
with eigenvalue $\theta$. Conversely, any eigenvalue $\theta \ne k$ of $\tilde{L}$ arises in this way.

The Laplacian matrices of the concurrence graphs in Fig. 6 are shown in
Table 1.
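The relationship between the two spectra can be checked numerically. Writing the Laplacian of the Levi graph in block form, with diagonal blocks R and $kI_b$ and off-diagonal blocks $-N$ and $-N^\top$, an eigenvalue $\phi$ of L and a matching eigenvalue $\theta$ of the Levi-graph Laplacian satisfy $rk - \phi = (r - \theta)(k - \theta)$, that is, $\theta^2 - (r + k)\theta + \phi = 0$. The sketch below uses an assumed labelling of the Fano plane (r = k = 3); since v = b in this example, the 2v roots account for all v + b eigenvalues.

```python
import numpy as np

# Assumed example: a standard labelling of the Fano plane.
blocks = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
          (2, 5, 7), (3, 4, 7), (3, 5, 6)]
v, b, k, r = 7, 7, 3, 3

N = np.zeros((v, b))
for j, block in enumerate(blocks):
    for t in block:
        N[t - 1, j] += 1
R = np.diag(N.sum(axis=1))                 # replications on the diagonal

L = k * R - N @ N.T                        # concurrence-graph Laplacian
L_levi = np.block([[R, -N], [-N.T, k * np.eye(b)]])  # Levi-graph Laplacian

# Roots of theta^2 - (r + k) theta + phi = 0 for every eigenvalue phi of L.
phi = np.linalg.eigvalsh(L)
disc = np.sqrt((r + k) ** 2 - 4 * phi)
predicted = np.sort(np.concatenate([((r + k) - disc) / 2,
                                    ((r + k) + disc) / 2]))
actual = np.linalg.eigvalsh(L_levi)
print(np.allclose(predicted, actual))
```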
3 Statistical issues



3.1 Estimation and variance 

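As part of the experiment, we measure the response $Y_\omega$ on each experimental
unit $\omega$. If $\omega$ is in block $\Gamma$, then we assume that

\[ Y_\omega = \tau_{f(\omega)} + \beta_\Gamma + \varepsilon_\omega; \qquad (2) \]

here, $\tau_i$ is a constant depending on treatment i, $\beta_\Gamma$ is a constant depending
on block $\Gamma$, and $\varepsilon_\omega$ is a random variable with expectation 0 and variance $\sigma^2$.
Furthermore, if $\alpha \ne \omega$, then $\varepsilon_\alpha$ and $\varepsilon_\omega$ are uncorrelated.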

It is clear that we can add a constant to every block parameter, and subtract
that constant from every treatment parameter, without changing (2).
It is therefore impossible to estimate the individual treatment parameters.
However, if the design is connected, then we can estimate all contrasts in the
treatment parameters: that is, all linear combinations of the form $\sum_i x_i \tau_i$
for which $\sum_i x_i = 0$. In particular, we can estimate all the simple treatment
differences $\tau_i - \tau_j$.

An estimator is a function of the responses $Y_\omega$, so it is itself a random
variable. An estimator of a value is unbiased if its expectation is equal to
the true value; it is linear if it is a linear function of the responses. Amongst
linear unbiased estimators, the best one (the so-called BLUE), is the one with
the least variance. Let $V_{ij}$ be the variance of the BLUE for $\tau_i - \tau_j$.

If all the experimental units form a single block, then the BLUE of $\tau_1 - \tau_2$
is just the difference between the average responses for treatments 1 and 2.
It follows that

\[ V_{12} = \left( \frac{1}{r_1} + \frac{1}{r_2} \right) \sigma^2. \]

When v = 2, this variance is minimized (for a given number of experimental
units) when $r_1 = r_2$. Moreover, if the responses are normally distributed
then the length of the 95% confidence interval for $\tau_1 - \tau_2$ is proportional
to $t(r_1 + r_2 - 2, 0.975)\sqrt{V_{12}}$, where $t(d, p)$ is the 100p-th percentile of the
t distribution on d degrees of freedom. The smaller the confidence interval,
the more likely is our estimate to be close to the true value. This length can
be made smaller by increasing $r_1 + r_2$, decreasing $|r_1 - r_2|$, or decreasing $\sigma^2$.

However, matters are not so simple when k < v and v > 2. The following 
result can be found in any statistical textbook about block designs (see the 
section on further reading for recommendations). 

Theorem 2 Let L be the Laplacian matrix of the concurrence graph of a
connected block design. If $\sum_i x_i = 0$, then the variance of the BLUE of
$\sum_i x_i \tau_i$ is equal to $(x^\top L^- x)k\sigma^2$. In particular, the variance $V_{ij}$ of the BLUE
of the simple difference $\tau_i - \tau_j$ is given by $V_{ij} = (L^-_{ii} + L^-_{jj} - 2L^-_{ij})k\sigma^2$.
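Theorem 2 translates directly into a computation. The sketch below returns the matrix of pairwise variances in units of $\sigma^2$; the function name `pairwise_variances` and the example design (an assumed labelling of the Fano plane) are illustrative choices, not part of the theorem.

```python
import numpy as np

def pairwise_variances(blocks, v):
    """Matrix of V_ij / sigma^2 for a connected design whose blocks all
    have the same size k (treatments are labelled 1..v)."""
    k = len(blocks[0])
    N = np.zeros((v, len(blocks)))
    for j, block in enumerate(blocks):
        for t in block:
            N[t - 1, j] += 1
    L = k * np.diag(N.sum(axis=1)) - N @ N.T   # concurrence-graph Laplacian
    L_minus = np.linalg.pinv(L)                # Moore-Penrose inverse
    d = np.diag(L_minus)
    return k * (d[:, None] + d[None, :] - 2 * L_minus)

# Assumed example: a standard labelling of the Fano plane.
fano = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
        (2, 5, 7), (3, 4, 7), (3, 5, 6)]
V = pairwise_variances(fano, 7)
print(V[0, 1])
```

For this balanced design $L = 7I - J$, so $L^-_{ii} = 6/49$ and $L^-_{ij} = -1/49$, and every pairwise variance equals $3 \times 14/49 = 6/7$ in units of $\sigma^2$.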

3.2 Optimality criteria 

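We want all of the $V_{ij}$ to be as small as possible, but this is a multi-dimensional
problem if v > 2. Let $\bar{V}$ be the average of the variances $V_{ij}$
over all treatments i, j with $i \ne j$. Theorem 2 shows that, for each fixed i,

\[ \begin{aligned}
\sum_{j \ne i} V_{ij} &= \sum_{j \ne i} \left( L^-_{ii} + L^-_{jj} - 2L^-_{ij} \right) k\sigma^2 \\
&= \left[ (v-1)L^-_{ii} + \left( \mathrm{Tr}(L^-) - L^-_{ii} \right) + 2L^-_{ii} \right] k\sigma^2 \\
&= \left[ vL^-_{ii} + \mathrm{Tr}(L^-) \right] k\sigma^2,
\end{aligned} \]

because the row sums and column sums of $L^-$ are all 0. It follows that
$\bar{V} = 2k\sigma^2\,\mathrm{Tr}(L^-)/(v-1)$.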

Let $\theta_1, \ldots, \theta_{v-1}$ be the non-trivial eigenvalues of L, now listed according
to multiplicity and in non-decreasing order. Then

\[ \mathrm{Tr}(L^-) = \frac{1}{\theta_1} + \cdots + \frac{1}{\theta_{v-1}} \]

and so

\[ \bar{V} = 2k\sigma^2 \times \frac{1}{\text{harmonic mean of } \theta_1, \ldots, \theta_{v-1}}. \]

A block design is defined to be A-optimal (in some given class of designs
with the same values of b, k and v) if it minimizes the value of $\bar{V}$; here 'A'
stands for 'average'. Thus a design is A-optimal if and only if it maximizes
the harmonic mean of $\theta_1, \ldots, \theta_{v-1}$.

For v > 2, the generalization of a confidence interval is a confidence
ellipsoid centered at the point $(\hat{\tau}_1, \ldots, \hat{\tau}_v)$ which gives the estimated value of
$(\tau_1, \ldots, \tau_v)$ in the (v - 1)-dimensional subspace of $\mathbb{R}^v$ for which $\sum_i \tau_i = 0$. A
block design is called D-optimal if it minimizes the volume of this confidence
ellipsoid. Since this volume is proportional to $\sqrt{\det(L^- + P_0)}$, a design is
D-optimal if and only if it maximizes the geometric mean of $\theta_1, \ldots, \theta_{v-1}$.
Here 'D' stands for 'determinant'.

Rather than looking at averages, we might consider the worst case. If all
the entries in the vector x are multiplied by a constant c, then the variance
of the estimator of $\sum_i x_i \tau_i$ is multiplied by $c^2$. Thus, those contrast vectors x
which give the largest variance relative to their own length are those which
maximize $x^\top L^- x / x^\top x$; these are precisely the eigenvectors of L with eigenvalue
$\theta_1$. A design is defined to be E-optimal if it maximizes the value of $\theta_1$;
here 'E' stands for 'extreme'.
More generally, for p in (0, ∞), a design is called $\Phi_p$-optimal if it minimizes

\[ \left( \frac{\sum_{i=1}^{v-1} \theta_i^{-p}}{v-1} \right)^{1/p}. \]

Thus A-optimality corresponds to p = 1, D-optimality corresponds to the
limit as p → 0, and E-optimality corresponds to the limit as p → ∞.
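The three special cases can be seen numerically by evaluating the criterion, taken here to be the p-th power mean of the reciprocals $1/\theta_i$, at p = 1, at a very small p, and at a large p. The sketch below is illustrative: the function name `phi_p` is an assumption, and the eigenvalues used are the non-trivial Laplacian eigenvalues 2, 2, 2, 4, 4, 4, 6 of the cube graph.

```python
import numpy as np

def phi_p(theta, p):
    """p-th power mean of the reciprocals of the non-trivial eigenvalues."""
    theta = np.asarray(theta, dtype=float)
    return np.mean(theta ** (-p)) ** (1.0 / p)

# Non-trivial Laplacian eigenvalues of the cube graph.
theta = np.array([2, 2, 2, 4, 4, 4, 6], dtype=float)

harmonic = len(theta) / np.sum(1.0 / theta)
geometric = np.exp(np.mean(np.log(theta)))

print(phi_p(theta, 1), 1 / harmonic)        # A-criterion: p = 1
print(phi_p(theta, 1e-6), 1 / geometric)    # D-criterion: p close to 0
print(phi_p(theta, 200), 1 / theta.min())   # E-criterion: p large
```

Minimizing the criterion is therefore the same as maximizing the harmonic mean (p = 1), the geometric mean (p → 0) or the smallest non-trivial eigenvalue (p → ∞).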

Let $L_1$ and $L_2$ be the Laplacian matrices of the concurrence graphs of
block designs $\Delta_1$ and $\Delta_2$ for v treatments in blocks of size k. If $L_2 - L_1$ is
positive semi-definite, then $\Delta_2$ is at least as good as $\Delta_1$ on all the $\Phi_p$-criteria.
Theorem 1(c) shows that adding an extra block to a design cannot decrease
its performance on any $\Phi_p$-criterion.

There are even more general classes of optimality criteria (see |2B] and 
for details). Here we concentrate on A-, D- and E-optimality. 

3.3 Questions and an example 

A first obvious question to ask is: do these criteria agree with each other? 

Our optimality properties are all functions of the concurrence graph. 
What features of this graph should we look for if we are searching for opti- 
mal, or near-optimal, designs? Symmetry? (Nearly) equal degrees? (Nearly) 
equal numbers of edges between pairs of vertices? Distance-regularity? Large 
girth (ignoring cycles within a block)? Small numbers of short cycles (ditto)? 
High connectivity? Non-trivial automorphism group? 

Is it more useful to look at the Levi graph rather than the concurrence 
graph? 

Example 1 Fig. 7 shows the values of the A- and D-criteria for all equireplicate
block designs with v = 8, b = 12 and k = 2: of course, these are
just regular graphs with 8 vertices and degree 3. The harmonic mean is
shown on the A-axis, and the geometric mean on the D-axis. (Note that this
figure includes some designs that were omitted from Figure 3 of [1].) The
rankings on these two criteria are not exactly the same, but they do agree
at the top end, where it matters. The second-best graph on both criteria
is the cube; the best is the Möbius ladder, whose vertices are the elements
of $\mathbb{Z}_8$ and whose edges are {i, i + 1} and {i, i + 4} for i in $\mathbb{Z}_8$. These two
graphs are so close on both criteria that, for practical purposes, they can be
regarded as equally good.

The plotting symbols show the edge-connectivity of the graphs: edge-connectivity
3, 2, 1 is shown as ×, +, ○ respectively. This does suggest that
the higher the edge-connectivity the better is the design on the A- and D-criteria.
This is intuitively reasonable: if k = 2, then the edge-connectivity
is the minimum number of blocks whose removal disconnects the design. In
this context, it has been called breakdown number: see [3S].

Figure 7: Values of two optimality criteria for all equireplicate block designs
with v = 8, b = 12, and k = 2, and for $K_{2,6}$

The four graphs with edge-connectivity 3 have no double edges, so con- 
currences differ by at most 1. The only other regular graph with no double 
edges is ranked eighth (amongst regular graphs) by the A-criterion. This 
suggests that (near-) equality of concurrences is not sufficient to give a good 
design. 

The symbol • shows the non-regular graph $K_{2,6}$, which also has eight
vertices and twelve edges. It is not as good as the regular graphs with edge-connectivity
3, but it beats many of the other regular graphs.

This pattern is typical of the block designs investigated by statisticians 
for most of the 20th century. The A- and D-criteria agree closely at the top 
end. High edge-connectivity appears to show good designs. Many of the best
designs have a high degree of symmetry. 
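The comparison between the two leading graphs of Example 1 can be reproduced from their Laplacian eigenvalues. The following sketch builds the Möbius ladder from the description in the example and the cube as the graph on 3-bit strings with edges between strings differing in one bit; the helper names are illustrative, and the spanning-tree count uses the eigenvalue product divided by the number of vertices.

```python
import numpy as np

def laplacian(n, edges):
    """Laplacian of a loopless multigraph on vertices 0..n-1."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1
        L[j, j] += 1
        L[i, j] -= 1
        L[j, i] -= 1
    return L

def criteria(n, edges):
    """Harmonic mean, geometric mean and spanning-tree count for a
    connected graph, from its non-trivial Laplacian eigenvalues."""
    theta = np.linalg.eigvalsh(laplacian(n, edges))[1:]  # drop eigenvalue 0
    harmonic = len(theta) / np.sum(1.0 / theta)
    geometric = np.exp(np.mean(np.log(theta)))
    trees = int(round(float(np.prod(theta)) / n))
    return harmonic, geometric, trees

# Moebius ladder: edges {i, i+1} and {i, i+4} on Z_8.
mobius = [(i, (i + 1) % 8) for i in range(8)] + [(i, i + 4) for i in range(4)]
# Cube: vertices are 3-bit integers, edges join integers differing in one bit.
cube = [(a, a ^ bit) for a in range(8) for bit in (1, 2, 4) if a < (a ^ bit)]

print("Moebius ladder:", criteria(8, mobius))
print("cube:          ", criteria(8, cube))
```

The Möbius ladder comes out slightly ahead on both means, and the two graphs have 392 and 384 spanning trees respectively, which illustrates how close they are.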

4 Highly patterned block designs 

4.1 Balanced incomplete-block designs 

BIBDs are intuitively appealing, as they seem to give equal weight to all
treatment comparisons. They were introduced for agricultural experiments
by Yates.

In [38], Kshirsagar proved that, if there exists a BIBD for given values
of v, b and k, then it is A-optimal. Kiefer generalized this in [35] to cover
$\Phi_p$-optimality for all p in (0, ∞), including the limiting cases of D- and E-optimality.
The core of Kiefer's proof is as follows: binary designs maximize
Tr(L), which is equal to $\sum_{i=1}^{v-1} \theta_i$; for any fixed value T of this sum of positive
numbers, $\sum_i \theta_i^{-p}$ is minimized, with value $(v-1)[T/(v-1)]^{-p}$, when
$\theta_1 = \cdots = \theta_{v-1} = T/(v-1)$; and this minimum value is smallest when T is maximized.

4.2 Other special designs 

Of course, it frequently occurs that the values of b, v, k available for an experiment
are such that no BIBD exists. (Necessary conditions for the existence
of a BIBD include the well-known divisibility conditions $v \mid bk$ and
$v(v-1) \mid bk(k-1)$, which follow from the elementary results in Section 1,
and Fisher's inequality asserting that $b \ge v$.)
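These necessary conditions are simple to screen for given parameters. The function below is a minimal sketch (its name and return format are illustrative); passing all three tests does not guarantee that a BIBD exists.

```python
def bibd_necessary_conditions(v, b, k):
    """Check the two divisibility conditions and Fisher's inequality.
    These are necessary, not sufficient, for a BIBD to exist."""
    return {
        "v divides bk": (b * k) % v == 0,
        "v(v-1) divides bk(k-1)": (b * k * (k - 1)) % (v * (v - 1)) == 0,
        "b >= v": b >= v,
    }

print(bibd_necessary_conditions(7, 7, 3))    # parameters of Fig. 3
print(bibd_necessary_conditions(15, 7, 3))   # parameters of Fig. 1
print(bibd_necessary_conditions(8, 12, 2))   # parameters of Example 1
```

For example, v = 22, b = 22, k = 7 passes all three checks even though no such design exists (it is excluded by the Bruck-Ryser-Chowla theorem), so the screen can only rule parameter sets out.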

In the absence of a BIBD, various other special types of design have been 
considered, and some of these have been proved optimal. Here is a short 
sample. 

A design is group-divisible if the treatments can be partitioned into "groups"
all of the same size, so that the number of blocks containing two treatments is
$\lambda_1$ if they belong to the same group and $\lambda_2$ otherwise. Cheng [THl [Uj showed
that if there is a group-divisible design with two groups and $\lambda_2 = \lambda_1 + 1$ in
the class of designs with given v, b, k, then it is $\Phi_p$-optimal for all p, and in
particular is A-, D- and E-optimal.

A regular-graph design is a binary equireplicate design with two possible
concurrences $\lambda$ and $\lambda + 1$. It is easily proved that, in such a design, the number
of treatments lying in $\lambda + 1$ blocks with a given treatment is constant; so the
graph H whose vertices are the treatments, two vertices joined if they lie in
$\lambda + 1$ blocks, is regular.

Now Cheng [17] showed that a group-divisible design with $\lambda_2 = \lambda_1 + 1 = 1$,
if one exists, is $\Phi_p$-optimal in the class of regular-graph designs for all p.
Cheng and Bailey [20] showed that a regular-graph design for which the
graph is strongly regular (see [13]) and which has singular concurrence matrix
is $\Phi_p$-optimal, for all p, among binary equireplicate designs with given v, b,
k.

Designs with the property described here are particular examples of par- 
tially balanced designs with respect to an association scheme: see Bailey [3]. 

Another class which turns out to be optimal in many cases, but
whose definition is less combinatorial, consists of the variance-balanced
designs, which we consider later in the chapter.
5 Graph concepts linked to D-optimality



5.1 Spanning trees of the concurrence graph 

Let G be the concurrence graph of a connected block design, and let L be 
its Laplacian matrix. A spanning tree for G is a spanning subgraph which is 
a tree. Kirchhoff's famous Matrix-tree theorem in [36] states the following:

Theorem 3 If G is a connected graph with v vertices and Laplacian ma- 
trix L, then the product of the non-trivial eigenvalues of L is equal to v 
multiplied by the number of spanning trees for G. 

Thus we have a test for D-optimality: 

A design is D-optimal if and only if its concurrence graph has the 
maximal number of spanning trees. 

Note that Theorem 3 gives an easy proof of Cayley's theorem on the number
of spanning trees for the complete graph $K_v$. The non-trivial eigenvalues
of its Laplacian matrix are all equal to v, so Theorem 3 shows that it has
$v^{v-2}$ spanning trees.

If G is sparse, it may be much easier to count the number of spanning 
trees than to compute the eigenvalues of L. For example, if G has a single 
cycle, which has length s, then the number of spanning trees is s, irrespective 
of the remaining edges in G. 
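Both observations can be checked with a few lines of code based on Theorem 3. In the sketch below the function name `spanning_trees` and the two example graphs (the complete graph on 5 vertices, and a 4-cycle with a pendant path attached) are illustrative assumptions.

```python
import numpy as np

def spanning_trees(n, edges):
    """Count spanning trees of a connected multigraph on vertices 0..n-1 as
    (product of non-trivial Laplacian eigenvalues) / n."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1
        L[j, j] += 1
        L[i, j] -= 1
        L[j, i] -= 1
    theta = np.linalg.eigvalsh(L)[1:]          # drop the trivial eigenvalue 0
    return int(round(float(np.prod(theta)) / n))

# Complete graph K_5: Cayley's formula gives 5^(5-2) = 125.
complete5 = [(i, j) for i in range(5) for j in range(i + 1, 5)]

# A graph with a single cycle of length 4 (0-1-2-3) and a pendant path 3-4-5.
unicyclic = [(0, 1), (1, 2), (2, 3), (3, 0), (3, 4), (4, 5)]

print(spanning_trees(5, complete5), spanning_trees(6, unicyclic))
```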

In the context of optimal block designs, Gaffke discovered the importance
of Kirchhoff's theorem in [26]. Cheng followed this up in papers such as
[TB| [T7t \W\ . Particularly intriguing is the following theorem from [2T] . 

Theorem 4 Consider block designs with k = 2 (connected graphs). For each
given v there is a threshold $b_0$ such that if $b > b_0$ then any D-optimal design
for v treatments in b blocks of size 2 is nearly balanced in the sense that

• no pair of replications differ by more than 1;

• for each fixed i, no pair of concurrences $\lambda_{ij}$ differ by more than 1.

In fact, there is no known example with $b_0 > v - 1$, which is the minimal
number of blocks required for connectivity.
5.2 Spanning trees of the Levi graph

In [27] Gaffke stated the following relationship between the numbers of span- 
ning trees in the concurrence graph and the Levi graph. 

Theorem 5 Let G and $\tilde{G}$ be the concurrence graph and Levi graph for a
connected incomplete-block design for v treatments in b blocks of size k. Then
the number of spanning trees for $\tilde{G}$ is equal to $k^{b-v+1}$ times the number of
spanning trees for G.

Thus, an alternative test for D-optimality is to count the number of spanning
trees in the Levi graph. For binary designs, the Levi graph has fewer
edges than the concurrence graph if and only if $k \ge 4$.
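The relationship between the two spanning-tree counts can be checked on small designs. The sketch below is illustrative only: it uses the block forms $kR - NN^\top$ for the concurrence-graph Laplacian and $\begin{bmatrix} R & -N \\ -N^\top & kI_b \end{bmatrix}$ for the Levi-graph Laplacian, and the two example designs (an assumed labelling of the Fano plane, and a design with k = 2 on four treatments) are chosen for the example.

```python
import numpy as np

def tree_count(L):
    """Spanning trees of a connected graph from its Laplacian L."""
    theta = np.linalg.eigvalsh(L)[1:]          # non-trivial eigenvalues
    return int(round(float(np.prod(theta)) / L.shape[0]))

def design_laplacians(blocks, v):
    """Laplacians of the concurrence graph and of the Levi graph."""
    b, k = len(blocks), len(blocks[0])
    N = np.zeros((v, b))
    for j, block in enumerate(blocks):
        for t in block:
            N[t - 1, j] += 1
    R = np.diag(N.sum(axis=1))
    return k * R - N @ N.T, np.block([[R, -N], [-N.T, k * np.eye(b)]])

examples = {
    "Fano plane (v=7, b=7, k=3)": ([(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
                                    (2, 5, 7), (3, 4, 7), (3, 5, 6)], 7),
    "4-cycle plus chord (v=4, b=5, k=2)": ([(1, 2), (2, 3), (3, 4),
                                            (4, 1), (1, 3)], 4),
}
results = {}
for name, (blocks, v) in examples.items():
    b, k = len(blocks), len(blocks[0])
    L, L_levi = design_laplacians(blocks, v)
    results[name] = (tree_count(L), tree_count(L_levi), k ** (b - v + 1))
    print(name, results[name])
```

In each case the second number equals the first multiplied by the third, in line with Theorem 5.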

6 Graph concepts linked to A-optimality 

6.1 The concurrence graph as an electrical network 

We can consider the concurrence graph G as an electrical network with a 
1-ohm resistance in each edge. Connect a 1-volt battery between vertices i 
and j. Then current flows in the network, according to these rules. 

Ohm's Law: In every edge, the voltage drop is the product of the current 
and the resistance. 

Kirchhoff's Voltage Law: The total voltage drop from one vertex to any
other vertex is the same no matter which path we take from one to the 
other. 

Kirchhoff's Current Law: At each vertex which is not connected to the
battery, the total current coming in is equal to the total current going 
out. 

We find the total current from i to j, and then use Ohm's Law to define 
the effective resistance Rij between i and j as the reciprocal of this current. 
It is a standard result of electrical network theory that the linear equations 
implicitly defined above for the currents and voltage differences have a unique 
solution. 

Let $\mathcal{T}$ be the set of treatments and $\Omega$ the set of experimental units.
Current flows in each edge $e_{\alpha\omega}$, where $\alpha$ and $\omega$ are experimental units in the
same block which receive different treatments; let $I(\alpha, \omega)$ be the current from
$f(\alpha)$ to $f(\omega)$ in this edge. Thus I is a function $I: \Omega \times \Omega \to \mathbb{R}$ such that

(a) $I(\alpha, \omega) = 0$ if $\alpha = \omega$ or if $f(\alpha) = f(\omega)$ or if $\alpha$ and $\omega$ are in different
blocks.

(b) $I(\alpha, \omega) = -I(\omega, \alpha)$ for $(\alpha, \omega)$ in $\Omega \times \Omega$.

This defines a further function $I_{\mathrm{out}}: \mathcal{T} \to \mathbb{R}$ by

\[ I_{\mathrm{out}}(l) = \sum_{\alpha : f(\alpha) = l} \ \sum_{\omega \in \Omega} I(\alpha, \omega) \quad \text{for } l \text{ in } \mathcal{T}. \]

Voltage is another function $V: \mathcal{T} \to \mathbb{R}$. The following two conditions ensure
that Ohm's and Kirchhoff's Laws are satisfied.

(c) If there is any edge in G between $f(\alpha)$ and $f(\omega)$, then

\[ I(\alpha, \omega) = V(f(\alpha)) - V(f(\omega)). \]

(d) If $l \notin \{i, j\}$, then $I_{\mathrm{out}}(l) = 0$.

If G is connected and different voltages V(i) and V(j) are given for a
pair of distinct treatments i and j, then there are unique functions I and V
satisfying conditions (a)-(d). Moreover, $I_{\mathrm{out}}(j) = -I_{\mathrm{out}}(i) \ne 0$. Then $R_{ij}$ is
defined by

\[ R_{ij} = \frac{V(i) - V(j)}{I_{\mathrm{out}}(i)}. \]

It can be shown that the value of $R_{ij}$ does not depend on the choice of values
for V(i) and V(j), so long as these are different. In practical examples, it is
usually convenient to take V(i) = 0 and let I take integer values.

What has all of this got to do with block designs? The following theorem, 
which is a standard result from electrical engineering, gives the answer. 

Theorem 6 If L is the Laplacian matrix of a connected graph G, then the
effective resistance $R_{ij}$ between vertices i and j is given by

\[ R_{ij} = L^-_{ii} + L^-_{jj} - 2L^-_{ij}. \]

Comparing this with Theorem 2, we see that $V_{ij} = R_{ij} \times k\sigma^2$. Hence we
have a test for A-optimality:

A design is A-optimal if and only if its concurrence graph, regarded
as an electrical network, minimizes the sum of the pairwise
effective resistances between all pairs of vertices.
Figure 8: The current between vertices i and j in a concurrence graph

Effective resistances are easy to calculate without matrix inversion if the
graph is sparse.

Figure 8 shows the concurrence graph of a block design with v = 12,
b = 6 and k = 3. Only vertices i and j are labelled. Otherwise, numbers
beside arrows denote current and numbers in square brackets denote voltage.
It is straightforward to check that conditions (a)-(d) are satisfied. Now
$V(i) - V(j) = 47$ and $I_{\mathrm{out}}(i) = 36$, and so $R_{ij} = 47/36$. Therefore
$V_{ij} = (47/12)\sigma^2$. Moreover, for graphs consisting of b triangles arranged in a cycle
like this, it is clear that average effective resistance, and hence the average
pairwise variance, can be calculated as a function of b.
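The value 47/36 can be reproduced from the Laplacian via the formula $R_{ij} = L^-_{ii} + L^-_{jj} - 2L^-_{ij}$. The sketch below rests on assumptions: the vertex labelling is invented for the example, and the choice of i and j (a vertex lying in only one triangle, and one of the two shared vertices farthest from it) is a guess at the pair used in the figure, made because it gives the stated value.

```python
import numpy as np

b, k = 6, 3
v = 2 * b
# Assumed labelling: shared vertices 0..5, unshared (apex) vertices 6..11;
# triangle t has vertices {t, (t + 1) mod 6, 6 + t}.
blocks = [(t, (t + 1) % b, b + t) for t in range(b)]

# Laplacian of the concurrence graph: one edge per pair within each block.
L = np.zeros((v, v))
for block in blocks:
    for x in block:
        for y in block:
            if x != y:
                L[x, x] += 1   # each ordered pair adds 1 to the degree of x
                L[x, y] -= 1   # and -1 to the off-diagonal entry

L_minus = np.linalg.pinv(L)
d = np.diag(L_minus)
R = d[:, None] + d[None, :] - 2 * L_minus    # effective resistances

apex, far_shared = 6, 3                      # hypothetical choice of i and j
print(R[apex, far_shared], 47 / 36)
print("pairwise variance / sigma^2:", k * R[apex, far_shared])
print("average effective resistance:", R[np.triu_indices(v, 1)].mean())
```

The same value can be found by hand: replacing each triangle by a star with three 1/3-ohm arms turns the network into a ring of total resistance 4 with the apex hanging off by 1/3, giving 1/3 + (5/3)(7/3)/4 = 47/36.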

6.2 The Levi graph as an electrical network 

The Levi graph $\tilde{G}$ of a block design can also be considered as an electrical
network. Denote by $\mathcal{B}$ the set of blocks. Now current is defined on the
ordered edges of the Levi graph. Recall that, if $\omega$ is an experimental unit
in block $\Gamma$, then the edge $\tilde{e}_\omega$ joins $\Gamma$ to $f(\omega)$. Thus current is defined on
$(\Omega \times \mathcal{B}) \cup (\mathcal{B} \times \Omega)$ and voltage is defined on $\mathcal{T} \cup \mathcal{B}$. Conditions (a)-(d) in
Section 6.1 need to be modified appropriately.

The next theorem shows that a current-voltage pair (I, V) on the concurrence
graph G can be transformed into a current-voltage pair $(\tilde{I}, \tilde{V})$ on
the Levi graph $\tilde{G}$. In $\tilde{G}$, the current $\tilde{I}(\alpha, \Gamma)$ flows in edge $\tilde{e}_\alpha$ from vertex
$f(\alpha)$ to vertex $\Gamma$, where $\alpha \in \Gamma$. Hence the pairwise variance $V_{ij}$ can also be
calculated from the effective resistance $\tilde{R}_{ij}$ in the Levi graph.

Theorem 7 Let G be the concurrence graph and $\tilde{G}$ be the Levi graph of a
connected block design with block size k. If i and j are two distinct treatments,
let $R_{ij}$ and $\tilde{R}_{ij}$ be the effective resistance between vertices i and j in
the electrical networks defined by G and $\tilde{G}$, respectively. Then $\tilde{R}_{ij} = kR_{ij}$,
and so $V_{ij} = \tilde{R}_{ij}\sigma^2$.

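Proof Let $(I, V)$ be a current-voltage pair on $G$. For
$(\alpha, \Gamma) \in \Omega \times B$, put

$$\tilde{I}(\alpha, \Gamma) = -\tilde{I}(\Gamma, \alpha) = \sum_{\omega \in \Gamma} I(\alpha, \omega)$$

if $\alpha \in \Gamma$; otherwise, put
$\tilde{I}(\alpha, \Gamma) = \tilde{I}(\Gamma, \alpha) = 0$. Put
$\tilde{V}(i) = kV(i)$ for all $i$ in $T$, and

$$\tilde{V}(\Gamma) = \sum_{\omega \in \Gamma} V(f(\omega))$$

for all $\Gamma$ in $B$. It is clear that $\tilde{I}$ satisfies the analogues of
conditions (a) and (b).

If $\alpha \in \Gamma$, then

$$\tilde{I}(\alpha, \Gamma) = \sum_{\omega \in \Gamma} I(\alpha, \omega)
  = \sum_{\omega \in \Gamma} [V(f(\alpha)) - V(f(\omega))]
  = kV(f(\alpha)) - \tilde{V}(\Gamma)
  = \tilde{V}(f(\alpha)) - \tilde{V}(\Gamma),$$

so the analogue of condition (c) is satisfied.

If $\Gamma \in B$, then

$$\tilde{I}_{\mathrm{out}}(\Gamma) = \sum_{\alpha \in \Gamma} \tilde{I}(\Gamma, \alpha)
  = -\sum_{\alpha \in \Gamma} \sum_{\omega \in \Gamma} I(\alpha, \omega) = 0,$$

because $I(\alpha, \alpha) = 0$ and $I(\alpha, \omega) = -I(\omega, \alpha)$.
If $\ell \in T$ then

$$\tilde{I}_{\mathrm{out}}(\ell) = \sum_{\alpha : f(\alpha) = \ell} \sum_{\Gamma \in B} \tilde{I}(\alpha, \Gamma)
  = \sum_{\alpha : f(\alpha) = \ell} \sum_{\omega \in \Omega} I(\alpha, \omega)
  = I_{\mathrm{out}}(\ell).$$

In particular, $\tilde{I}_{\mathrm{out}}(\ell) = 0$ if $\ell \notin \{i, j\}$,
which shows that the analogue of condition (d) is satisfied. It follows that
$(\tilde{I}, \tilde{V})$ is the current-voltage pair on $\tilde{G}$ defined by
$\tilde{V}(i)$ and $\tilde{V}(j)$.

Now

$$\tilde{R}_{ij} = \frac{\tilde{V}(i) - \tilde{V}(j)}{\tilde{I}_{\mathrm{out}}(i)}
  = \frac{k(V(i) - V(j))}{I_{\mathrm{out}}(i)} = kR_{ij}.$$

Then Theorems [2] and M show that $V_{ij} = \tilde{R}_{ij}\sigma^2$.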
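Theorem 7 can be checked numerically on a small example. The following is a minimal sketch rather than anything taken from the text: the design `blocks`, the helper name `resistance`, and the vertex numbering (treatments first, then blocks) are all assumptions made for illustration. It solves the Kirchhoff equations exactly over the rationals and compares the two effective resistances for every pair of treatments.

```python
from fractions import Fraction
from itertools import combinations

def resistance(n, edges, i, j):
    """Effective resistance between i and j when every edge is a unit resistor."""
    keep = [u for u in range(n) if u != j]        # ground vertex j
    pos = {u: t for t, u in enumerate(keep)}
    m = len(keep)
    # Augmented system [L' | e_i]: reduced Laplacian, unit current entering at i.
    M = [[Fraction(0)] * (m + 1) for _ in range(m)]
    for a, b in edges:
        for x, y in ((a, b), (b, a)):
            if x != j:
                M[pos[x]][pos[x]] += 1
                if y != j:
                    M[pos[x]][pos[y]] -= 1
    M[pos[i]][m] = Fraction(1)
    # Gauss-Jordan elimination; the reduced Laplacian of a connected graph is
    # positive definite, so each pivot is non-zero.
    for c in range(m):
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(m):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return M[pos[i]][m]   # voltage at i equals the resistance for unit current

# Hypothetical design: v = 4 treatments, b = 3 blocks of size k = 3.
blocks = [(0, 1, 2), (1, 2, 3), (0, 2, 3)]
v, k = 4, 3
# Concurrence graph: one edge per pair of units in a block with distinct treatments.
concurrence = [(s, t) for blk in blocks for s, t in combinations(blk, 2) if s != t]
# Levi graph: treatment vertices 0..v-1, block vertices v.., one edge per unit.
levi = [(t, v + g) for g, blk in enumerate(blocks) for t in blk]

ratios = {(i, j): resistance(v + len(blocks), levi, i, j)
                  / resistance(v, concurrence, i, j)
          for i, j in combinations(range(v), 2)}
```

Under these assumptions every entry of `ratios` should equal the block size, as the theorem predicts.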



When k = 2 it seems to be easier to use the concurrence graph than the
Levi graph, because it has fewer vertices, but for larger values of k the Levi
graph may be better, as it does not have all the within-block cycles that the
concurrence graph has. Fig. 9 gives the Levi graph of the block design whose
concurrence graph is in Fig. [HI with the same two vertices i and j attached to
the battery. This gives $\tilde{R}_{ij} = 47/12$, which is in accordance with
Theorem 7.

Figure 9: Current between i and j for the Levi graph corresponding to the
concurrence graph in Fig. [H]

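Here is another way of visualizing Theorem 7. From the block design we
construct a graph $G_0$ with vertex-set $T \cup \Omega \cup B$: the edges are
$\{\alpha, \Gamma\}$ for $\alpha \in \Gamma \in B$ and $\{\alpha, f(\alpha)\}$
for $\alpha \in \Omega$. Let $(I_0, V_0)$ be a current-voltage pair on $G_0$ for
which both battery vertices are in $T$. We obtain the Levi graph $\tilde{G}$
from $G_0$ by becoming blind to the vertices in $\Omega$. Thus the resistance in
each edge of $\tilde{G}$ is twice that in each edge in $G_0$, so this step
multiplies each effective resistance by 2.

Because none of the battery vertices is in $B$, we can now obtain $G$ from
$\tilde{G}$ by replacing each path of the form $(i, \Gamma, j)$ by an edge
$\{i, j\}$. There is no harm in scaling all the voltages by the same amount, so
we can obtain $(I, V)$ on $G$ from $(\tilde{I}, \tilde{V})$ on $\tilde{G}$ by
putting $V(i) = \tilde{V}(i)/k$ for $i$ in $T$, and
$I(\alpha, \omega) = V(f(\alpha)) - V(f(\omega))$ for $\alpha$, $\omega$ in the
same block. If $\Gamma$ is a block, then

$$0 = \sum_{\alpha \in \Gamma} \tilde{I}(\alpha, \Gamma)
    = \sum_{\alpha \in \Gamma} [\tilde{V}(f(\alpha)) - \tilde{V}(\Gamma)]
    = k \sum_{\alpha \in \Gamma} V(f(\alpha)) - k\tilde{V}(\Gamma),$$

and so $\tilde{V}(\Gamma) = \sum_{\alpha \in \Gamma} V(f(\alpha))$. Also, if
$\alpha \in \Gamma$, then

$$\sum_{\omega \in \Gamma} I(\alpha, \omega)
    = \sum_{\omega \in \Gamma} [V(f(\alpha)) - V(f(\omega))]
    = kV(f(\alpha)) - \sum_{\omega \in \Gamma} V(f(\omega))
    = \tilde{V}(f(\alpha)) - \tilde{V}(\Gamma)
    = \tilde{I}(\alpha, \Gamma).$$

Therefore, this transformation reverses the one used in the proof of
Theorem 7.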

There is yet another way of obtaining Theorem 7. If we use the responses to
estimate the block parameters $\beta_\Gamma$ in (E]) as well as the treatment
parameters $\tau_i$, then standard theory of linear models shows that, if the
design is connected, then we can estimate linear combinations of the form
$\sum_{i=1}^{v} x_i\tau_i + \sum_{j=1}^{b} z_j\beta_j$ so long as
$\sum x_i = \sum z_j$. Moreover, the variance of the BLUE of this linear
combination is

$$\begin{bmatrix} x^\top & z^\top \end{bmatrix} C^{-} \begin{bmatrix} x \\ z \end{bmatrix} \sigma^2,
  \quad\text{where}\quad
  C = \begin{bmatrix} R & N \\ N^\top & kI_b \end{bmatrix},$$

$N$ is the $v \times b$ treatment-block incidence matrix
and $R$ is the diagonal matrix of replications.

If we reparametrize the model by replacing $\beta_j$ by $-\gamma_j$ for
j = 1, ..., b, then the estimable quantities are the contrasts in
$\tau_1, \ldots, \tau_v, \gamma_1, \ldots, \gamma_b$. The so-called information
matrix $C$ must be modified by multiplying the last b rows and the last b
columns by $-1$: this gives precisely the Laplacian $\tilde{L}$ of the Levi
graph $\tilde{G}$. Just as for $L$, but unlike $C$, the null space is spanned
by the all-1 vector.

6.3 Spanning thickets 

We have seen that the value of the D-criterion is a function of the number 
of spanning trees of the concurrence graph G. It turns out that the closely 
related notion of a spanning thicket enables us to calculate the A-criterion; 
more precisely, the value of each pairwise effective resistance in G. 

A spanning thicket for the graph is a spanning subgraph that consists of 
two trees (one of them may be an isolated vertex). 

Theorem 8 If i and j are distinct vertices of G then

$$R_{ij} = \frac{\text{number of spanning thickets with } i, j \text{ in different parts}}{\text{number of spanning trees}}.$$

This is also rather easy to calculate directly when the graph is sparse.
Summing all the $R_{ij}$ and using Theorem |H] gives the following result from
m-

Theorem 9 If F is a spanning thicket for the concurrence graph G, denote
by $F_1$ and $F_2$ the sets of vertices in its two trees. Then

$$\frac{\sum_{\text{spanning thickets } F} |F_1|\,|F_2|}{\text{number of spanning trees}}$$
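The thicket formula of Theorem 8 lends itself to a brute-force check on a very small graph. The sketch below is an illustration under stated assumptions: the example graph (a triangle with one pendant vertex) and the function names are hypothetical, and exhaustive enumeration of edge subsets is only sensible for tiny graphs. Summing the ratio over all pairs weights each thicket by $|F_1||F_2|$, which is the quantity in the numerator above.

```python
from fractions import Fraction
from itertools import combinations

def components(n, edge_subset):
    """Component label of each vertex, or None if the edge subset has a cycle."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for a, b in edge_subset:
        ra, rb = find(a), find(b)
        if ra == rb:
            return None           # adding this edge would close a cycle
        parent[ra] = rb
    return [find(x) for x in range(n)]

def count_trees(n, edges):
    # A spanning tree is an acyclic set of n - 1 edges.
    return sum(1 for sub in combinations(edges, n - 1)
               if components(n, sub) is not None)

def thicket_resistance(n, edges, i, j):
    # A spanning thicket is an acyclic set of n - 2 edges (exactly two trees).
    split = 0
    for sub in combinations(edges, n - 2):
        comp = components(n, sub)
        if comp is not None and comp[i] != comp[j]:
            split += 1
    return Fraction(split, count_trees(n, edges))

def thicket_weight_sum(n, edges):
    # Sum of |F1| * |F2| over all spanning thickets F.
    total = 0
    for sub in combinations(edges, n - 2):
        comp = components(n, sub)
        if comp is not None:
            size = comp.count(comp[0])
            total += size * (n - size)
    return total

# Hypothetical example: triangle 0-1-2 with a leaf 3 attached to vertex 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
r03 = thicket_resistance(4, edges, 0, 3)   # series/parallel rules give 2/3 + 1
```

For this graph the value agrees with the elementary series and parallel rules, which is a useful sanity check before trusting the enumeration on anything larger.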

6.4 Random walks and electrical networks 

It was first pointed out by Kakutani in 1945 that there is a very close con- 
nection between random walks and electrical networks. In a simple random 
walk, a single step works as follows: starting at a vertex, we choose an edge 
containing the vertex at random, and move along it to the other end. This 
definition accommodates multiple edges, and is easily adapted to graphs with 
edge weights (where the probability of moving along an edge is proportional 
to the weight of the edge). 

If we are thinking of an edge-weighted graph as an electrical network, we 
take the weights to be the conductances of the edges (the reciprocals of the 
resistances). 

The connection is simple to state: 

Theorem 10 Let i and j be distinct vertices of the connected edge-weighted
graph G. Apply voltages of 1 at i and 0 at j. Then the voltage at a vertex $\ell$
is equal to the probability that the random walk, starting at $\ell$, reaches i
before it reaches j.

From this theorem, it is possible to derive a formula for the effective
resistance between two vertices. Here are two such formulas. Given two
vertices i and j, let $P_{\mathrm{esc}}(i \to j)$ be the probability that a
random walk starting at i reaches j before returning to i; and let $S_i(i,j)$
be the expected number of times that a random walk starting at i visits i
before reaching j. Then the effective resistance between i and j is given by
either of the two expressions

$$\frac{1}{d_i P_{\mathrm{esc}}(i \to j)} \quad\text{and}\quad \frac{S_i(i,j)}{d_i},$$

where $d_i$ is the degree of i. (If the edge resistances are not all 1, then the
term $d_i$ should be replaced by the sum of the reciprocals of the resistances
of all edges incident with vertex i.)
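The escape-probability formula can be illustrated by computing hitting probabilities exactly. This is a minimal sketch under assumptions of my own: unit edge resistances, a small hypothetical example graph, and helper names (`solve`, `escape_probability`) that do not come from the text. The hitting probabilities are harmonic away from i and j, so they are found by solving a small linear system.

```python
from fractions import Fraction

def solve(A, rhs):
    """Solve A x = rhs exactly by Gauss-Jordan elimination over the rationals."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(A, rhs)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n] for row in M]

def escape_probability(n, edges, i, j):
    """Probability that a simple random walk from i reaches j before returning to i."""
    nbrs = [[] for _ in range(n)]
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    inner = [u for u in range(n) if u not in (i, j)]
    idx = {u: t for t, u in enumerate(inner)}
    # h(u) = P(reach j before i | start at u): harmonic at inner vertices,
    # with boundary values h(i) = 0 and h(j) = 1.
    A = [[0] * len(inner) for _ in inner]
    rhs = [0] * len(inner)
    for u in inner:
        A[idx[u]][idx[u]] = len(nbrs[u])
        for w in nbrs[u]:
            if w == j:
                rhs[idx[u]] += 1
            elif w != i:
                A[idx[u]][idx[w]] -= 1
    h = dict(zip(inner, solve(A, rhs)))
    h[i], h[j] = Fraction(0), Fraction(1)
    # First step from i is uniform over the edges at i.
    return sum(h[w] for w in nbrs[i]) / len(nbrs[i])

# Hypothetical example: triangle 0-1-2 with a leaf 3 attached to vertex 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
p_esc = escape_probability(4, edges, 0, 3)
r_from_walk = 1 / (2 * p_esc)          # d_0 = 2 in this graph
```

Here `r_from_walk` can be compared with the resistance obtained from the series and parallel rules (2/3 across the triangle plus 1 for the pendant edge).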
The random walk approach gives alternative proofs of some of the main
results about electrical networks. We discuss this further in the guide to the
literature.



6.5 Foster's formula and generalizations 

In 1948, Foster discovered that the sum of the effective resistances between
all adjacent pairs of vertices of a connected graph on v vertices is equal
to v − 1. Thirteen years later, he found a similar formula for pairs of vertices
at distance 2:



Further extensions have been found, but require a stronger condition on the
graph. The sum of resistances between all pairs of vertices at distance at
most m can be written down explicitly if the graph is walk-regular up to
distance m; this means that the number of closed walks of length k starting
and finishing at a vertex i is independent of i, for k ≤ m. The formula was
discovered by Emil Vaughan, to whom this part of the chapter owes a debt.

In particular, if the graph is distance-regular (see [H]), then the value 
of the A-criterion can be written down in terms of the so-called intersection 
array of the graph. 
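Foster's first formula is easy to test on a concrete graph. The sketch below uses the 3-dimensional cube as a hypothetical example and an exact rational solver whose name and structure are assumptions for illustration. It also evaluates the form in which Foster's second formula is usually quoted, namely that the sum of $R_{ik}/d_j$ over all paths $i \sim j \sim k$ equals $v - 2$; that statement is supplied here as an assumption, since the displayed formula is not available above.

```python
from fractions import Fraction
from itertools import combinations

def resistance(n, edges, i, j):
    """Effective resistance between i and j when every edge is a unit resistor."""
    keep = [u for u in range(n) if u != j]        # ground vertex j
    pos = {u: t for t, u in enumerate(keep)}
    m = len(keep)
    M = [[Fraction(0)] * (m + 1) for _ in range(m)]
    for a, b in edges:
        for x, y in ((a, b), (b, a)):
            if x != j:
                M[pos[x]][pos[x]] += 1
                if y != j:
                    M[pos[x]][pos[y]] -= 1
    M[pos[i]][m] = Fraction(1)                    # unit current enters at i
    for c in range(m):                            # Gauss-Jordan, pivots non-zero
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(m):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return M[pos[i]][m]

# Cube: vertices 0..7, edges join labels differing in exactly one bit.
cube = [(a, a ^ (1 << bit)) for a in range(8) for bit in range(3)
        if a < a ^ (1 << bit)]

# First formula: sum of resistances over adjacent pairs.
foster_first = sum(resistance(8, cube, a, b) for a, b in cube)

def foster_second(n, edges):
    # Assumed form: sum over unordered paths i ~ j ~ k of R_ik / d_j
    # (written for simple graphs only).
    nbrs = [[] for _ in range(n)]
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    return sum(resistance(n, edges, i, k) / len(nbrs[j])
               for j in range(n) for i, k in combinations(nbrs[j], 2))

second = foster_second(8, cube)
```

For the cube, `foster_first` should come out as v − 1 = 7 and `second` as v − 2 = 6 if the assumed form is right.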

6.6 Distance 

At first sight it seems obvious that pairwise variance should decrease as con- 
currence increases, but there are many counter-examples to this. However, 
the following theorem is proved in [3]. 

Theorem 11 If the Laplacian matrix L has precisely two distinct non-trivial 
eigenvalues, then pairwise variance is a decreasing linear function of concur- 
rence. 

It does appear that effective resistance, and hence pairwise variance,
generally increases with distance in the concurrence graph. In [7, Question 5.1]
we pointed out that this is not always exactly so, and asked if it is nevertheless
true that the maximal value of $R_{ij}$ is achieved for some pair of vertices {i,j}
whose distance apart in the graph is maximal. Here is a counter-example.

Example 2 Let k = 2, so that the block design is the same as its concurrence
graph. Take v = 10 and b = 14. The graph consists of a cube, with two
extra vertices 1 and 2 attached as leaves to vertex 3. The vertex antipodal
to 3 in the cube is labelled 4. It is straightforward to check (either using an
electrical network, or by using the fact that the cube is distance-regular) that
the effective resistance between a pair of cube vertices is 7/12, 3/4 and 5/6
for vertices at distances 1, 2 and 3. Hence $R_{1j} \le 11/6$ for all cube
vertices j, while $R_{12} = 2$. On the other hand, the distance between vertices
1 and 2 is only 2, while that between either of them and vertex 4 is 4.
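The numbers in Example 2 can be reproduced with exact arithmetic. In the sketch below the labelling differs from the example and is an assumption made for convenience: cube vertices are 0..7 (binary labels), the two leaves are 8 and 9, both attached to cube vertex 0, and the antipodal cube vertex is 7. The helper names are likewise illustrative.

```python
from collections import deque
from fractions import Fraction

def resistance(n, edges, i, j):
    """Effective resistance between i and j when every edge is a unit resistor."""
    keep = [u for u in range(n) if u != j]        # ground vertex j
    pos = {u: t for t, u in enumerate(keep)}
    m = len(keep)
    M = [[Fraction(0)] * (m + 1) for _ in range(m)]
    for a, b in edges:
        for x, y in ((a, b), (b, a)):
            if x != j:
                M[pos[x]][pos[x]] += 1
                if y != j:
                    M[pos[x]][pos[y]] -= 1
    M[pos[i]][m] = Fraction(1)                    # unit current enters at i
    for c in range(m):                            # Gauss-Jordan, pivots non-zero
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(m):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return M[pos[i]][m]

def distances(n, edges, src):
    """Breadth-first-search distances from src."""
    nbrs = [[] for _ in range(n)]
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in nbrs[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

cube = [(a, a ^ (1 << bit)) for a in range(8) for bit in range(3)
        if a < a ^ (1 << bit)]
edges = cube + [(8, 0), (9, 0)]                   # v = 10 vertices, b = 14 edges

# Resistance from cube vertex 0, keyed by Hamming distance within the cube.
r_cube = {bin(j).count("1"): resistance(10, edges, 0, j) for j in range(1, 8)}
r_leaves = resistance(10, edges, 8, 9)
r_leaf_to_cube_max = max(resistance(10, edges, 8, j) for j in range(8))
dist_from_leaf = distances(10, edges, 8)
```

The pair of leaves is at distance 2 yet has the largest resistance, while the leaf and the antipodal cube vertex are at distance 4 with a smaller resistance.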

There are some 'nice' graphs where pairwise variance does indeed increase 
with distance. The following result is proved in [B]. Biggs gave the equivalent 
result for effective resistances in [12]. 

Theorem 12 Suppose that a block design has just two distinct concurrences, 
and that the pairs of vertices corresponding to the larger concurrence form 
the edges of a distance-regular graph H. Then pairwise variance increases
with distance in H.

7 Graph concepts linked to E-optimality 

7.1 Measures of bottlenecks 

A 'good' graph (for use as a network) is one without bottlenecks: any set of
vertices should have many edges joining it to its complement. So, for any
subset S of vertices, we let $\partial S$ (the boundary of S) be the set of edges
which have one vertex in S and the other in its complement, and then define
the isoperimetric number $\iota(G)$ by

$$\iota(G) = \min\left\{ \frac{|\partial S|}{|S|} : S \subseteq V(G),\ 0 < |S| \le \frac{v}{2} \right\}.$$

The next result shows that the isoperimetric number is related to the
E-criterion. It is useful not so much for identifying the E-optimal designs as for
easily showing that large classes of designs cannot be E-optimal: any design
whose concurrence graph has low isoperimetric number performs poorly on
the E-criterion.

Cutset Lemma 1 Let G have an edge-cutset of size c whose removal separates
the graph into parts S and G \ S with m and n vertices respectively,
where 0 < m ≤ n. Then

$$\theta_1 \le c\left(\frac{1}{m} + \frac{1}{n}\right).$$

Proof We know that $\theta_1$ is the minimum of $x^\top L x / x^\top x$ over real
vectors x with $\sum_i x_i = 0$. Put

$$x_i = \begin{cases} n & \text{if } i \in S \\ -m & \text{otherwise.} \end{cases}$$

Then $x^\top x = nm(m + n)$ and

$$x^\top L x = \sum_{\text{edges } ij} (x_i - x_j)^2 = c(m + n)^2.$$

Hence

$$\theta_1 \le \frac{x^\top L x}{x^\top x} = \frac{c(m + n)^2}{nm(m + n)}
  = c\left(\frac{1}{m} + \frac{1}{n}\right) \le \frac{2c}{m} = \frac{2|\partial S|}{|S|}.$$

Corollary 1 Let $\theta_1$ be the smallest non-trivial eigenvalue of the Laplacian
matrix L of the connected graph G. Then $\theta_1 \le 2\iota(G)$.
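The test vector used in the proof of Cutset Lemma 1 can be evaluated directly. The following sketch is illustrative only: the graph (a triangle joined to a 4-cycle by two edges) and the function name `rayleigh_quotient` are assumptions. It computes the Rayleigh quotient of the vector that is n on S and −m elsewhere, which is an upper bound for the smallest non-trivial Laplacian eigenvalue.

```python
from fractions import Fraction

def rayleigh_quotient(edges, x):
    """x^T L x / x^T x, using x^T L x = sum over edges of (x_a - x_b)^2."""
    num = sum((x[a] - x[b]) ** 2 for a, b in edges)
    den = sum(t * t for t in x)
    return Fraction(num, den)

# Hypothetical graph: triangle {0,1,2}, 4-cycle {3,4,5,6}, joined by two edges.
edges = [(0, 1), (1, 2), (0, 2),                 # triangle
         (3, 4), (4, 5), (5, 6), (6, 3),         # 4-cycle
         (2, 3), (1, 4)]                         # the edge-cutset
S = {0, 1, 2}
m, n = len(S), 7 - len(S)
c = sum(1 for a, b in edges if (a in S) != (b in S))   # size of the cutset

x = [n if u in S else -m for u in range(7)]      # sums to zero by construction
bound = rayleigh_quotient(edges, x)
```

With these choices `bound` equals c(1/m + 1/n), so the smallest non-trivial eigenvalue of this graph is at most 7/6.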

There is also an upper bound for the isoperimetric number in terms of
$\theta_1$, which is loosely referred to as a 'Cheeger-type inequality'; for
details, see the further reading.

We also require a second cutset lemma, phrased in terms of vertex cutsets. 

Cutset Lemma 2 Let G have a vertex-cutset C of size c whose removal
separates the graph into parts S and T with m, n vertices respectively (so
nm > 0). Let m′ and n′ be the number of edges from vertices in C to vertices
in S, T respectively. Then

$$\theta_1 \le \frac{m'n^2 + n'm^2}{nm(m + n)}.$$

In particular, if there are no multiple edges at any vertex of C then
$\theta_1 \le c$, with equality if and only if every vertex in C is joined to
every vertex in S ∪ T.

Proof Put

$$x_i = \begin{cases} n & \text{if } i \in S \\ -m & \text{if } i \in T \\ 0 & \text{otherwise.} \end{cases}$$

Then $x^\top x = nm(m + n)$ and $x^\top L x = m'n^2 + n'm^2$, and so

$$\theta_1 \le \frac{m'n^2 + n'm^2}{nm(m + n)}.$$

If there are no multiple edges at any vertex in C then m′ ≤ cm and n′ ≤ cn
and the result follows. ■
7.2 Variance balance

A block design is variance-balanced if all the concurrences $\lambda_{ij}$ are
equal for i ≠ j. In such a design, all of the pairwise variances $V_{ij}$ are
equal. Morgan and Srivastav proved the following result in [43].

Theorem 13 If the constant concurrence $\lambda$ of a variance-balanced design
satisfies $(v - 1)\lambda = \lfloor bk/v \rfloor (k - 1)$ then the design is E-optimal.

A block with k different treatments contributes k(k − 1)/2 edges to the
concurrence graph. Let us define the defect of a block to be

$$\frac{k(k - 1)}{2} - \text{the number of edges it contributes to the graph.}$$

The following result is proved in [7]. 

Theorem 14 If k < v, then a variance-balanced design with v treatments is 
E-optimal if the sum of the block defects is less than v/2. 

Table [U^b) shows that the design in Fig. 2(b) is variance-balanced. Block
$\Gamma_1$ has defect 1, and each other block has defect 0, so the sum of the
block defects is certainly less than 5/2, and Theorem 14 shows that the design is
E-optimal. It is rather counter-intuitive that the non-binary design in Fig. 2(b)
can be better than the design in Fig. 2(a); in fact, in his contribution to the
discussion of Tocher's paper pT], which introduced this design, David Cox
said

I suspect that . . . balanced ternary designs are of no practical
value.

Computation shows that the design in Fig. 2(a) is $\Phi_p$-better than the one
in Fig. 2(b) if p < 5.327. In particular, it is A- and D-better.



8 Some history 

As we have seen, if the experimental units form a single block and there are 
only two treatments then it is best for their replications to be as equal as 
possible. Statisticians know this so well that it is hard for us to imagine 
that more information may be obtained, about all treatment comparisons, if 
replications differ by more than 1. 

In agriculture, or in any area with qualitative treatments, A-optimality
is the natural criterion. If treatments are quantities of different substances,
then D-optimality is preferable, as the ranking on this criterion is invariant
to change of measurement units. Thus industrial statisticians have tended
to prefer D-optimality, although E-optimality has become popular among
chemical process engineers. Perhaps the different camps have not talked to
each other as much as they should have.

For most of the 20th century, it was normal practice in field experiments 
to have all treatments replicated three or four times. Where incomplete 
blocks were used, they typically had size from 3 to 20. Yates introduced his
square lattice designs with $v = k^2$ in [55]. He used uniformity data and two
worked examples to show that these designs can give lower average pairwise
variance than a design using a highly replicated control treatment, but both
of his examples were equireplicate with r ∈ {3, 4}.

In the 1930s, 1940s and 1950s, analysis of the data from an experiment 
involved inverting the Laplacian matrix without a computer: this is easy 
for BIBDs, and only slightly harder if the Laplacian matrix has only two 
distinct non-trivial eigenvalues. The results in [35] and [38] encouraged the
beliefs that the optimal designs, on all $\Phi_p$-criteria, are as equireplicate as
possible, with concurrences as equal as possible, and that the same designs 
are optimal, or nearly so, on all of these criteria. 

Three short papers in the same journal in 1977-1982 demonstrate the 
beliefs at that time. In [29], John and Mitchell did not even consider designs 
with unequal replication. They conjectured that, if there exist any regular- 
graph designs for given values of v, b and k, then the A- and D-optimal designs 
are regular-graph designs. For the parameter sets which they had examined 
by computer search, the same designs were optimal on the A- and D-criteria. 
In [33], Jones and Eccleston reported the results of various computer searches 
for A-optimal designs without the constraint of equal replication. For k = 2
and b = v ∈ {10, 11, 12} (but not v = 9) their A-optimal design is almost
a queen-bee design, and their designs are D-worse than those in [29]. The
belief in equal replication was so ingrained that some readers assumed that 
there was an error in their program. 

John and Williams followed this with the paper [30] on conjectures for 
optimal block designs for given values of v, b and k. Their conjectures in- 
cluded: 

• the set of regular-graph designs always contains one that is optimal 
without this restriction; 

• among regular-graph designs, the same designs are optimal on the A- 
and D-criteria. 

They endorsed Cox's dismissal of non-binary designs, strengthening it to the
statement that they "are inefficient", and declared that the three unequally
replicated A-optimal designs in [33] were "of academic rather than of
practical interest". These conjectures and opinions seemed quite reasonable to
people who had been finding good designs for the sizes needed in agricultural
experiments.

At the end of the 20th century, there was an explosion in the number of 
experiments in genomics, using microarrays. Simplifying the story greatly, 
these are effectively block designs with k = 2, and biologists wanted A- 
optimal designs, but they did not know the vocabulary 'block' or 'A-optimal', 
'graph' or 'cycle'. Computers were now much more powerful than in 1980, 
and researchers in genomics could simply undertake computer searches without
the benefit of any statistical theory. In 2001, Kerr and Churchill [31]
published the results of a computer search for A-optimal designs with k = 2
and v = b ≤ 11. For v ∈ {10, 11, 12}, their results were completely consistent
with those in [33], which they did not cite. They called cycles loop designs.

Mainstream statisticians began to get involved. In 2005, Wit, Nobile and
Khanin published the paper [53] giving the results of a computer search for
A- and D-optimal designs with k = 2 and v = b. The results are shown
in Fig. 10. The A-optimal designs differ from the D-optimal designs when
v ≥ 9, but are consistent with those found in [34].

What is going on here? Why are the designs so different when v ≥ 9?
Why is there such a sudden, large change in the A-optimal designs? We 
explain this in the next section. 

9 Block size two 
9.1 Least replication 

If k = 2, then the design is the same as its concurrence graph, and
connectivity requires that b ≥ v − 1. If b = v − 1, then all connected designs are
trees, such as those in Fig. 11. Theorem 3 shows that the D-criterion does
not differentiate between them.

In a tree, the effective resistance $R_{ij}$ is just the length of the unique path
between vertices i and j. Theorems [2] and M show that the only A-optimal
designs are the stars, such as the graph on the right of Fig. 11.

In a star with v vertices, the contrast between any two leaves is an
eigenvector of the Laplacian matrix L with eigenvalue 1, while the contrast
between the central vertex and all the other vertices is an eigenvector with
eigenvalue v. If v ≥ 5 and G is not a star then there is an edge whose
removal splits the graph into two components of sizes at least 2 and 3. Cutset
Lemma 1 then shows that $\theta_1 \le 5/6 < 1$. The only other tree which is
not a star is the path of length 3, for which direct calculation shows that
$\theta_1 = 2 - \sqrt{2} < 1$. Hence the E-optimal designs are also the stars.

(a) D-optimal designs

(b) A-optimal designs

Figure 10: D- and A-optimal designs with k = 2 and 6 ≤ v = b ≤ 11

Figure 11: Two trees with v = 9, b = 8 and k = 2



9.2 One fewer treatment 

If b = v and k = 2, then the concurrence graph G contains a single cycle: such
graphs are called unicyclic. Let s be the length of the cycle. All the remaining
vertices are in trees attached to various vertices of the cycle. Fig. 12 shows
two unicyclic graphs with v = 12 and s = 6.

(a)

(b)

Figure 12: Two unicyclic graphs with b = v = 12 and s = 6

As we remarked in Section 5, the number of spanning trees in a unicyclic
graph is equal to the length of the cycle. Hence, Theorem 3 gives the following
result.

Theorem 15 If k = 2 and b = v ≥ 3, then the D-optimal designs are
precisely the cycles.
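The claim that a unicyclic graph has as many spanning trees as the length of its cycle can be confirmed with the matrix-tree theorem. The sketch below is a hedged illustration: the function name and the three example graphs (all with v = b = 6) are assumptions, and the count is taken as the determinant of the Laplacian with one row and column deleted.

```python
from fractions import Fraction

def spanning_trees(n, edges):
    """Number of spanning trees, as a cofactor of the Laplacian matrix."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for a, b in edges:
        L[a][a] += 1
        L[b][b] += 1
        L[a][b] -= 1
        L[b][a] -= 1
    M = [row[:-1] for row in L[:-1]]          # delete last row and column
    det = Fraction(1)
    size = n - 1
    for c in range(size):
        p = next((r for r in range(c, size) if M[r][c] != 0), None)
        if p is None:
            return 0                          # singular: graph is disconnected
        if p != c:
            M[c], M[p] = M[p], M[c]
            det = -det                        # a row swap flips the sign
        det *= M[c][c]
        for r in range(c + 1, size):
            f = M[r][c] / M[c][c]
            M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return int(det)

# Three hypothetical unicyclic graphs with v = b = 6.
cycle6 = [(t, (t + 1) % 6) for t in range(6)]
square_with_leaves = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 4), (0, 5)]
triangle_with_leaves = [(0, 1), (1, 2), (2, 0), (0, 3), (0, 4), (0, 5)]

counts = [spanning_trees(6, g)
          for g in (cycle6, square_with_leaves, triangle_with_leaves)]
```

Among these three graphs the full cycle has the most spanning trees, in line with Theorem 15.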

For A-optimality, we first show that no graph like the one in Fig. 12(a)
can be optimal. If vertex 12 is moved so that it is joined to vertex 6, instead
of vertex 1, then the sum of the variances $V_{i,12}$ for i in the cycle is
unchanged and the variances $V_{i,12}$ for the remaining vertices i are all
decreased. This argument shows that all the trees must be attached to the same
vertex of the cycle.

Figure 13: Average pairwise variance, in a unicyclic graph with v vertices, as
a function of the length s of the cycle

Now consider the tree on vertices 6, 8, 9, 10 and 11 in Fig. 12(a). If the
two edges incident with vertex 8 are modified to those in Fig. 12(b), then the
set of variances between these five vertices is unchanged, as are all others
involving vertex 8, but those between vertices 9, 10, 11 and vertices outside
this tree are all decreased. This argument shows that, for any given length s
of the cycle, the only candidate for an A-optimal design has v − s leaves
attached to a single vertex of the cycle.

The effective resistance between a pair of vertices at distance d in a cycle
of length s is d(s − d)/s, while that between a leaf and the cycle vertex to
which it is attached is 1. Hence a short calculation shows that the sum of
the pairwise effective resistances is equal to g(s)/12, where

$$g(s) = -s^3 + 2vs^2 + 13s - 12sv + 12v^2 - 14v.$$

Now $\bar{V}/\sigma^2 = g(s)/[3v(v - 1)]$ and we seek the minimum of g(s) for
integers s in the interval [2, v].

Fig. 13 plots g(s)/[3v(v − 1)] for s in [2, v] and 6 ≤ v ≤ 13. When v ≤ 7,
the function g is monotonic decreasing, so it attains its minimum on [2, v]
at s = v. For all larger values of v, the function g has a local minimum in
the interval [3, 5]: when v ≥ 9, the value at this local minimum is less than
g(v). This change from the upper end of the interval to the local minimum
explains the sudden change in the A-optimal designs. Detailed examination
of the local minimum gives the following result.
Theorem 16 If k = 2 and b = v ≥ 3 then the A-optimal designs are:

• a cycle, if v ≤ 8;

• a square with v − 4 leaves attached to one vertex, if 9 ≤ v ≤ 11;

• a triangle with v − 3 leaves attached to one vertex, if v ≥ 13;

• either of the last two, if v = 12.
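The thresholds in Theorem 16 can be recovered by minimizing the cubic g(s) over the integers. The sketch below is a direct transcription of that polynomial, with v passed as an extra argument; the function names are hypothetical and the range of v examined is an arbitrary choice.

```python
def g(v, s):
    """12 times the sum of pairwise resistances: cycle of length s, v - s leaves."""
    return -s**3 + 2 * v * s**2 + 13 * s - 12 * s * v + 12 * v**2 - 14 * v

def best_cycle_lengths(v):
    """All integer cycle lengths s in [2, v] that minimize g."""
    values = {s: g(v, s) for s in range(2, v + 1)}
    low = min(values.values())
    return [s for s, val in values.items() if val == low]

# Minimizing cycle length for each number of vertices v = b.
optimal = {v: best_cycle_lengths(v) for v in range(3, 21)}
```

The dictionary `optimal` shows the jump from the full cycle (s = v) to a square at v = 9, and the tie between square and triangle at v = 12. As a consistency check, g(v, v) reduces to v³ − v, which is 12 times the sum of d(v − d)/v over all pairs in a v-cycle.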

What about E-optimality? The smallest eigenvalue of the Laplacian matrix
of the triangle with one or more leaves attached to one vertex is 1, as
is that of the digon with two or more leaves attached to one vertex. We
now show that almost all other unicyclic graphs have at least one non-trivial
eigenvalue smaller than this.

Suppose that vertex i in the cycle has a non-empty tree attached to it, so
that {i} is a vertex cutset. If s ≥ 3 then there are no double edges, so Cutset
Lemma 2 shows that $\theta_1 < 1$ unless all vertices are joined to i, in which
case s = 3. If s = 2 and there are trees attached to both vertices of the digon,
then applying Cutset Lemma 2 at each of these vertices shows that $\theta_1 < 1$
unless v = 4 and there is one leaf at each vertex of the digon: for this graph,
$\theta_1 = 3 - \sqrt{5} < 1$. A digon with leaves attached to one vertex is just
a star with one edge doubled.

The cycle of size v is a cyclic design. The smallest eigenvalue of its
Laplacian matrix is $2(1 - \cos(2\pi/v))$, which is greater than 1 when v ≤ 5,
is equal to 1 when v = 6, and is less than 1 when v ≥ 7. When v = 3 it
is equal to 3, which is greater than $3 - \sqrt{3}$, which is the smallest
Laplacian eigenvalue of the digon with one leaf.

Putting all of this together proves the following result. 

Theorem 17 If k = 2 and b = v ≥ 3, then the E-optimal designs are:

• a cycle, if v ≤ 5;

• a triangle with v − 3 leaves attached to one vertex, or a star with one
edge doubled, if v ≥ 7;

• either of the last two, if v = 6.
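The crossover at v = 6 in Theorem 17 comes from comparing the cycle's smallest non-trivial eigenvalue, 2(1 − cos(2π/v)), with the value 1 attained by the triangle with leaves. A minimal floating-point sketch follows; the function names and the tolerance are assumptions.

```python
import math

def cycle_theta1(v):
    """Smallest non-trivial Laplacian eigenvalue of the cycle on v vertices."""
    return 2 * (1 - math.cos(2 * math.pi / v))

def compare_to_one(v):
    """Classify cycle_theta1(v) against 1, allowing for rounding error."""
    t = cycle_theta1(v)
    if math.isclose(t, 1.0, abs_tol=1e-12):
        return "equal"
    return "greater" if t > 1 else "less"

comparison = {v: compare_to_one(v) for v in range(3, 10)}
```

The comparison is done in floating point, so the equality at v = 6 is only detected up to the stated tolerance.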

Thus, for v ≥ 9, the ranking on the D-criterion is essentially the opposite
of the ranking on the A- and E-criteria. The A- and E-optimal designs are 
far from equireplicate. The change is sudden, not gradual. These findings 
were initially quite shocking to statisticians. 
9.3 More blocks



What happens when b is larger than v but still has the same order of
magnitude? The following theorems show that the A- and E-optimal designs are
very different from the D-optimal designs when v is large. The proofs of
Theorems 19 and 20 are in [1] and [7] respectively.

Theorem 18 Let G be the concurrence graph of a connected block design $\Delta$
with k = 2 and b ≥ v. If $\Delta$ is D-optimal then G does not contain any bridge
(an edge cutset of size one): in particular, G contains no leaves.

Proof Suppose that {i, j} is an edge-cutset for G. Let H and K be the
parts of G containing i and j, respectively.

Since G is not a tree, we may assume that H is not a tree, and so there is
some edge e in H that is not in every spanning tree for H. Let $n_1$ and $n_2$ be
the numbers of spanning trees for H that include and exclude e, respectively,
and let m be the number of spanning trees for K. Every spanning tree for G
consists of spanning trees for H and K together with the edge {i, j}. Hence
G has $(n_1 + n_2)m$ spanning trees.

Let $\ell$ be a vertex on e with $\ell \ne i$. Form G′ from G by removing edge e
and inserting the edge e′, where $e' = \{\ell, j\}$.

Let T and T′ be spanning trees for H and K respectively. If T does not
contain e then $T \cup \{\{i, j\}\} \cup T'$ and $T \cup \{e'\} \cup T'$ are both
spanning trees for G′. If T contains e then
$(T \setminus \{e\}) \cup \{\{i, j\}\} \cup \{e'\} \cup T'$ is a spanning tree
for G′. Hence the number of spanning trees for G′ is at least $(2n_2 + n_1)m$,
which is greater than $(n_1 + n_2)m$ because $n_2 \ge 1$. Hence G does not have
the maximal number of spanning trees and so $\Delta$ is not D-optimal. ■

Theorem 19 Let c be a positive integer. Then there is a positive integer
$v_c$ such that if b − v = c and $v \ge v_c$ then all A-optimal designs with k = 2
contain leaves.

Theorem 20 If 20 ≤ v ≤ b ≤ 5v/4 then the concurrence graph for any
E-optimal design with k = 2 contains leaves.

Of course, to obtain a BIBD when k = 2, b needs to be a quadratic
function of v. What happens if b is merely a linear function of v? In [7] we
conjectured that if b = cv for some constant c then there is a threshold result
like the one in Theorem [T2J. However, current work by Robert Johnson and
Mark Walters [52] suggests something much more interesting: that there is
a constant C with 3 < C < 4 such that if b ≥ Cv and k = 2 then all
A-optimal designs are (nearly) equireplicate, and that random such graphs
(in a suitable model) are close to A-optimal with high probability. On the
other hand if b < Cv then a graph consisting of a large almost equireplicate
part (all degrees 3 and 4 with average degree close to Cv) together with a
suitable number of leaves joined to a single vertex is strictly better than any
queen-bee design.

9.4 A little more history 

The results on D- and A-optimality in Sections 9.1 and 9.2 were proved in
partly to put to rest mutterings that the results of [331 EH IS3] found by
computer search were incorrect. The results on E-optimality are in [7].

In spite of the horror with which these results were greeted, it transpired
that they were not new. The D- and E-optimal designs for b = (v − 1)/(k − 1)
were identified in [11] in 1991. The A-optimal designs for k = 2 and b = v − 1
had been given in [41] in 1991. Also in 1991, Tjur gave the A-optimal designs
for k = 2 and b = v in [52]: his proof used the Levi graph as an electrical
network.

A fairly common response to these unexpected results was 'It seems to
be just block size 2 that is a problem.' Perhaps those of us who usually
deal with larger blocks had simply not thought that it was worth while to
investigate block size 2 before the introduction of microarrays.

However, as we sketch in the next section, the problem is not block size 2
but very low average replication. The proofs there are similar to those in
this section; they are given in more detail in [HI |36]. Once again, it turns out
that these results are not all new. The D-optimal designs for v/(k − 1) blocks
of size k were given by Balasubramanian and Dey in [10] in 1996, but their
proof uses a version of Theorem 5 with the wrong value of the constant. The
A-optimal designs for v/(k − 1) blocks of size k were published by Krafft and
Schaefer in [37] in 1997, but those authors are not blameless either, because
they apparently had not read [52].

Our best explanation is that agricultural statisticians are so familiar with 
average replication being at least 3 that when we saw these papers we decided 
that they had no applicability and so forgot them. 

10 Very low average replication 

In this section we once again consider general block size k. A block design
is connected if and only if its Levi graph is connected. The Levi graph has
v + b vertices and bk edges, so connectivity implies that bk ≥ b + v − 1; that
is, b(k − 1) ≥ v − 1.

10.1 Least replication

If b(k − 1) = v − 1 and the design is connected, then the Levi graph
$\tilde{G}$ is a tree and the concurrence graph G looks like those in Fig. 5.
Hypergraph-theorists do not seem to have an agreed name for such designs.

For both D- and A-optimality, it turns out to be convenient to use the
Levi graph. Since all the Levi graphs are trees, Theorem O shows that the
D-criterion does not distinguish among connected designs.

By Theorem 7, $V_{ij} = \tilde{R}_{ij}\sigma^2$. When $\tilde{G}$ is a tree,
$\tilde{R}_{ij} = 2$ when i and j are in the same block; otherwise,
$\tilde{R}_{ij} = 4$ if any block containing i has a treatment in common with
any block containing j; and otherwise, $\tilde{R}_{ij} \ge 6$. The queen-bee
designs are the only ones for which $\tilde{R}_{ij} \le 4$ for all i and j,
and so they are the A-optimal designs.

The non-trivial eigenvalues of a queen-bee design are 1, k and v, with
multiplicities b − 1, b(k − 2) and 1, respectively. If the design is not a
queen-bee design, then there is a treatment i that is in more than one block but
not in all blocks. Thus vertex i forms a cutset for the concurrence graph G
which is not joined to every other vertex of G. Cutset Lemma 2 shows that
$\theta_1 < 1$. Hence the E-optimal designs are also the queen-bee designs.

10.2 One fewer treatment 

If b(k − 1) = v, then the Levi graph $\tilde{G}$ has bk edges and bk vertices,
and so it contains a single cycle, which must be of some even length 2s. If
2 ≤ s ≤ b, then the design is binary; if s = 1, then there is a single non-binary
block, whose defect is 1. In this case, k ≥ 3, because each block must have more
than one treatment.

For 2 ≤ s ≤ b, let C(b, k, s) be the class of designs constructed as follows.
Start with a loop design for s treatments. Insert k − 2 extra treatments into
each block. The remaining b − s blocks all contain the same treatment from
the loop design, together with k − 1 extra treatments. Figs. [3 and M show
the concurrence graph and Levi graph, respectively, of a design in C(6, 3, 6).

For k ≥ 4, the designs in C(b, k, 1) have one treatment which occurs twice
in one block and once in all other blocks, with the remaining treatments all
replicated once. The class C(b, 3, 1) contains all such designs, and also those
in which the treatment in every block is the one which occurs only once in
the non-binary block.

Theorem 21 If b(k − 1) = v, then the D-optimal designs are those in
C(b, k, b).

Proof The Levi graph $\tilde{G}$ is unicyclic, so its number of spanning trees is
maximized when the cycle has maximal length. Theorem 5 shows that the
D-optimal designs are precisely those with s = b. ■

Theorem 22 If b(k − 1) = v then the A-optimal designs are those in C(b, k, s),
where the value of s is given in Table 2.

          b
  k       2    3    4    5    6    7    8    9   10   11   12       13
  2       2    3    4    5    6    7    8    4    4    4   3 or 4    3
  3       2    3    4    5    6    3    3    3    3    3   2         2
  4       2    3    4    5    3    2    2    2    2    2   2         2
  5       2    3    4    5    2    2    2    2    2    2   2         2
  6       2    3    4    2    2    2    2    2    2    2   2         2

Table 2: Value of s for A-optimal designs for b(k − 1) treatments in b blocks
of size k: see Theorem 22



Proof The Levi graph G has one cycle, whose length is 2s, where 1 ≤ s ≤ b.
A similar argument to the one used at the start of the proof of Theorem [TBI
shows that this cannot be A-optimal unless the design is in C(b, k, s). If
s ≥ 2 or k ≥ 4, then each block-vertex in the cycle has k − 2
treatment-vertices attached as leaves; all other block-vertices are joined to
the same single treatment-vertex in the cycle, and each has k − 1
treatment-vertices attached as leaves. In C(b, 3, 1) the first type of design
has a Levi graph like this, and the other type has the same multiset of
effective resistances between treatment-vertices, because their concurrence
graphs are identical. The following calculations use the first type.

Let $V_1$ be the set of treatment-vertices in the cycle, $V_2$ the set of other
treatment-vertices joined to blocks in the cycle, and $V_3$ the set of
remaining treatment-vertices. For 1 ≤ i ≤ j ≤ 3, denote by $\mathcal{R}_{ij}$
the sum of the pairwise resistances between vertices in $V_i$ and $V_j$.



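Two vertices at distance m on a cycle of length 2s, with unit resistance on
each edge, are joined by paths of resistance m and 2s − m in parallel, so the
effective resistance between them is m(2s − m)/2s. Put

\[ R_1 = \sum_{d=1}^{s-1} \frac{2d(2s-2d)}{2s} = \frac{s^2-1}{3} \]

and

\[ R_2 = \sum_{d=0}^{s-1} \frac{(2d+1)(2s-2d-1)}{2s} = \frac{2s^2+1}{6}. \]

Then

\begin{align*}
\mathcal{R}_{11} &= sR_1/2, \\
\mathcal{R}_{12} &= s(k-2)(R_2+s), \\
\mathcal{R}_{13} &= (b-s)(k-1)(R_1+2s), \\
\mathcal{R}_{22} &= s(k-2)(k-3) + s(k-2)^2[R_1+2(s-1)]/2, \\
\mathcal{R}_{23} &= (b-s)(k-1)(k-2)(R_2+3s), \\
\mathcal{R}_{33} &= (b-s)(k-1)(k-2) + 2(b-s)(b-s-1)(k-1)^2.
\end{align*}

Hence the sum of the pairwise effective resistances between treatment-vertices
in the Levi graph is g(s)/6, where

\[ g(s) = -(k-1)^2 s^3 + 2b(k-1)^2 s^2 - [6bk(k-1) - 4k^2 + 2k - 1]s + c \]

and c = b(k − 1)[12b(k − 1) − 5k − 4].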
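The formula for the resistance sum can be compared with a direct computation. The sketch below is a check under our own assumptions, not the authors' method: it builds the Levi graph of one design in C(b, k, s) with integer labels, inverts the Laplacian with one block-vertex grounded (exact rational arithmetic), and uses the standard identity R(u, w) = M[u][u] + M[w][w] − 2M[u][w] for the grounded inverse M. All function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def levi_laplacian(b, k, s):
    """Levi-graph Laplacian; vertices 0..v-1 are treatments, v..v+b-1 blocks."""
    blocks, nxt = [], s
    for j in range(b):
        if j < s:
            blocks.append([j, (j + 1) % s] + list(range(nxt, nxt + k - 2)))
            nxt += k - 2
        else:
            blocks.append([0] + list(range(nxt, nxt + k - 1)))
            nxt += k - 1
    v = nxt
    lap = [[Fraction(0)] * (v + b) for _ in range(v + b)]
    for j, blk in enumerate(blocks):
        for t in blk:
            lap[t][t] += 1
            lap[v + j][v + j] += 1
            lap[t][v + j] -= 1
            lap[v + j][t] -= 1
    return lap, v

def inverse(mat):
    """Gauss-Jordan inverse over the rationals."""
    n = len(mat)
    a = [list(row) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(mat)]
    for c in range(n):
        p = next(r for r in range(c, n) if a[r][c] != 0)
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [x / piv for x in a[c]]
        for r in range(n):
            if r != c and a[r][c] != 0:
                f = a[r][c]
                a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return [row[n:] for row in a]

def resistance_sum(b, k, s):
    """Sum of effective resistances over all pairs of treatment-vertices."""
    lap, v = levi_laplacian(b, k, s)
    m = inverse([row[:-1] for row in lap[:-1]])   # ground the last block-vertex
    return sum(m[u][u] + m[w][w] - 2 * m[u][w]
               for u, w in combinations(range(v), 2))

def g(b, k, s):
    c = b * (k - 1) * (12 * b * (k - 1) - 5 * k - 4)
    return (-(k - 1) ** 2 * s ** 3 + 2 * b * (k - 1) ** 2 * s ** 2
            - (6 * b * k * (k - 1) - 4 * k ** 2 + 2 * k - 1) * s + c)

for s in range(2, 6):
    print(s, resistance_sum(5, 3, s), Fraction(g(5, 3, s), 6))
```

The two printed columns agree for each s. Exact fractions are used so that the comparison is an equality rather than a floating-point approximation.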

If s = 1 then the design is non-binary. However,

\[ g(1) - g(2) = (3k - 9 + 6b)(k - 1) - 3, \]

which is positive, because k ≥ 2 and b ≥ 2. Therefore the non-binary designs
are never A-optimal.
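For readers who wish to verify the difference, it can be expanded term by term from the definition of g; the constant c cancels. This is a routine check of our own rather than part of the original argument.

```latex
\begin{align*}
g(1) - g(2)
 &= -(k-1)^2(1 - 8) + 2b(k-1)^2(1 - 4) + [6bk(k-1) - 4k^2 + 2k - 1] \\
 &= 7(k-1)^2 - 6b(k-1)^2 + 6bk(k-1) - 4k^2 + 2k - 1 \\
 &= 7(k-1)^2 + 6b(k-1) - 4k^2 + 2k - 1 \\
 &= 3k^2 - 12k + 6 + 6b(k-1) \\
 &= (3k - 9 + 6b)(k-1) - 3 .
\end{align*}
```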

Direct calculation shows that g(2) > g(3) when b = 3, and that g(2) >
g(3) > g(4) when b = 4. These inequalities hold for all values of k, even
though g is not decreasing on the interval [2, 4] for large k when b = 4.

If b = 5 and k ≥ 6, then g(3) > g(2) and g(5) > g(2). Thus the local
minimum of g occurs in the interval (1, 3) and is the overall minimum of g
on the interval [1, 5].

Differentiation gives

\[ g'(b) = b(k-1)[(b-6)(k-1) - 6] + 4k^2 - 2k + 1. \]

If g'(b) > 0 then g has a local minimum in the interval (1, b). If, in addition,
g(3) > g(2), then the minimal value for integer s occurs at s = 2. These
conditions are both satisfied if k = 3 and b ≥ 12, k = 4 and b ≥ 8, k ≥ 5 and
b ≥ 7, or k ≥ 9 and b ≥ 6.
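The thresholds can be checked numerically. The sketch below simply evaluates the two conditions, g'(b) > 0 and g(3) > g(2), using the cubic g from the proof; the function names and the search bound of 100 are our own choices.

```python
def g(b, k, s):
    """The cubic g(s) from the proof of Theorem 22."""
    c = b * (k - 1) * (12 * b * (k - 1) - 5 * k - 4)
    return (-(k - 1) ** 2 * s ** 3 + 2 * b * (k - 1) ** 2 * s ** 2
            - (6 * b * k * (k - 1) - 4 * k ** 2 + 2 * k - 1) * s + c)

def g_prime_at_b(b, k):
    """Derivative of g with respect to s, evaluated at s = b."""
    return b * (k - 1) * ((b - 6) * (k - 1) - 6) + 4 * k ** 2 - 2 * k + 1

def both_conditions(b, k):
    return g_prime_at_b(b, k) > 0 and g(b, k, 3) > g(b, k, 2)

def first_b(k, limit=100):
    """Smallest b >= 3 (below the arbitrary limit) at which both conditions hold."""
    return next(b for b in range(3, limit) if both_conditions(b, k))

for k in (3, 4, 5, 9):
    print(k, first_b(k))
```

For k = 3, 4, 5 and 9 the first values of b found are 12, 8, 7 and 6 respectively, in agreement with the bounds stated above.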

Given Theorem there remain only a finite number of pairs (b, k) to
be checked individually to find the smallest value of g(s). The results are in
Table 2. ■
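The individual checks are easy to mechanise. As a sketch, assuming only the formula for g given in the proof, the following lists for each pair (b, k) every integer s with 2 ≤ s ≤ b at which g is smallest; s = 1 is omitted because g(1) > g(2). The name best_s is our own.

```python
def g(b, k, s):
    """The cubic g(s) from the proof of Theorem 22."""
    c = b * (k - 1) * (12 * b * (k - 1) - 5 * k - 4)
    return (-(k - 1) ** 2 * s ** 3 + 2 * b * (k - 1) ** 2 * s ** 2
            - (6 * b * k * (k - 1) - 4 * k ** 2 + 2 * k - 1) * s + c)

def best_s(b, k):
    """All integers s in [2, b] at which g attains its minimum."""
    values = {s: g(b, k, s) for s in range(2, b + 1)}
    low = min(values.values())
    return [s for s, val in values.items() if val == low]

for k in range(2, 7):
    row = ["/".join(map(str, best_s(b, k))) for b in range(2, 14)]
    print(k, " ".join(row))
```

Each printed row can be compared with the corresponding row of Table 2; a tie between two values of s is shown with a slash.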

Theorem 23 If b(k − 1) = v, b ≥ 3 and k ≥ 3, then the E-optimal designs
are those in C(b, k, b) if b ≤ 4, and those in C(b, k, 2) and C(b, k, 1) if b ≥ 5.

Proof If 2 < s < b then the concurrence graph G has a vertex which forms
a vertex-cutset and which is not joined to all other vertices; moreover, G has
no multiple edges. Thus Cutset Lemma 2 shows that θ1 < 1.

Direct calculation shows that θ1 = 1 if s = 1 or s = 2. For k ≥ 4, all
contrasts between singly replicated treatments in the same block are
eigenvectors of the Laplacian matrix L with eigenvalue k. When k ≥ 3 and s = b
the contrast between singly and doubly replicated treatments has eigenvalue
2(k − 1). For s = b, a straightforward calculation shows that the remaining
eigenvalues of L are

\[ k - \cos\left(\frac{2\pi n}{b}\right) \pm \sqrt{(k-1)^2 - \sin^2\left(\frac{2\pi n}{b}\right)} \]

for 1 ≤ n ≤ b − 1. The smallest of these is

\[ k - \cos(2\pi/b) - \sqrt{(k-1)^2 - \sin^2(2\pi/b)}; \]

this is greater than 1 if b = 3 or b = 4, but less than 1 if k ≥ 3 and b ≥ 5. ■
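The closed form can be tested numerically. The sketch below makes assumptions of our own: the design in C(b, k, b) is labelled with loop blocks {i, i+1 mod b}, and the trial eigenvector takes the value w^i on loop treatment i and a common multiple of w^i on the singly replicated treatments of block i, where w is a primitive b-th root of unity. It confirms only that the stated value is an eigenvalue of the concurrence Laplacian, not that it is the smallest one.

```python
import cmath
import math

def smallest_candidate(b, k):
    """Closed-form value k - cos(2*pi/b) - sqrt((k-1)^2 - sin^2(2*pi/b))."""
    phi = 2 * math.pi / b
    return k - math.cos(phi) - math.sqrt((k - 1) ** 2 - math.sin(phi) ** 2)

def concurrence_laplacian(b, k):
    """Laplacian of the concurrence graph of one design in C(b, k, b), b >= 3."""
    blocks, nxt = [], b
    for i in range(b):
        blocks.append([i, (i + 1) % b] + list(range(nxt, nxt + k - 2)))
        nxt += k - 2
    lap = [[0] * nxt for _ in range(nxt)]
    for blk in blocks:
        for x in blk:
            for y in blk:
                if x != y:
                    lap[x][x] += 1
                    lap[x][y] -= 1
    return lap, blocks

def residual(b, k):
    """Largest entry of |Lx - lam*x| for the trial eigenvector x described above."""
    lam = smallest_candidate(b, k)
    lap, blocks = concurrence_laplacian(b, k)
    w = cmath.exp(2j * math.pi / b)
    x = [0j] * len(lap)
    for i, blk in enumerate(blocks):
        x[i] = w ** i                              # loop treatment i
        for t in blk[2:]:                          # singly replicated treatments
            x[t] = (1 + w) * w ** i / (2 - lam)
    return max(abs(sum(a * c for a, c in zip(row, x)) - lam * xi)
               for row, xi in zip(lap, x))

for b in range(3, 8):
    lam = smallest_candidate(b, 3)
    print(b, round(lam, 4), "exceeds 1" if lam > 1 else "below 1",
          residual(b, 3) < 1e-9)
```

For k = 3 the value exceeds 1 only when b is 3 or 4, matching the dichotomy in the theorem.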

11 Further reading 

The Laplacian matrix of a graph, and its eigenvalues, are widely used,
especially in connection with network properties such as connectivity,
expansion, and random walks. A good introduction to this material can be found
in the textbook by Bollobás [13], especially Chapters II (electrical networks)
and IX (random walks). The connection between the smallest non-zero eigenvalue
and connectivity is described in surveys by de Abreu [1] and by Mohar [12].
In this terminology, a version of Theorem [T7] is in [21].

The basic properties of electrical networks can be found in textbooks of
electrical engineering, for example Balabanian and Bickart [QJ. A treatment
connected to the multivariate Tutte polynomial appears in Sokal's survey [50].
Bollobás describes several approaches to the theory, including the fact (which
we have not used) that the current flow minimises the power consumed in
the network, and explains the interactions between electrical networks and
random walks in the network. See also Deo [22].

The connection with optimal design theory was discussed in detail by the
authors in their survey [7]. Further reading on optimal design can be found
in John and Williams [HT], Schwabe [17], or Shah and Sinha [H]. For general
principles of experimental design, see Bailey [5].

Acknowledgement This chapter was written at the Isaac Newton Institute
for Mathematical Sciences, Cambridge, UK, during the 2011 programme